LucidWorks Search v2.5

This is the documentation for LucidWorks Search v2.5, the latest release.

The $LWE_HOME/app/examples/perl directory contains utilities that demonstrate many of the LucidWorks REST API features from Perl code. Use them to help manage a LucidWorks Search installation, or as examples of how to write Perl code for customer applications that interact with LucidWorks and Solr.

Dependencies

All of these tools require that the "LWP" and "JSON" Perl modules be installed.

All of these tools assume that the main URL for LucidWorks is "http://localhost:8888" and that the URL for the UI is "http://localhost:8989".

If LucidWorks is running elsewhere, set the LWE_URL and LWE_UI_URL environment variables in the shell where you will use these tools.

With the exception of "collections.pl" (which deals with multiple collections), all of these tools work with "collection1" by default. To use a different collection, set the LWE_COLLECTION environment variable in the shell where you will use these tools.
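As a minimal sketch of the environment setup described above, the following sets all three variables for the current shell session. The host name search.example.com is a made-up placeholder, and "products" is simply the collection name used in the collections.pl examples on this page; substitute your own values.

```shell
# Hypothetical host name, used only for illustration.
export LWE_URL="http://search.example.com:8888"
export LWE_UI_URL="http://search.example.com:8989"

# Work against a collection other than the default "collection1".
export LWE_COLLECTION="products"

# Show what the tools started from this shell will see.
echo "API: $LWE_URL  UI: $LWE_UI_URL  collection: $LWE_COLLECTION"
```

Because the variables are exported, any tool started from the same shell afterwards inherits them; a new terminal session starts again from the defaults.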

Basic Usage

All of these tools can be run without any arguments to see "help" information about their usage.

View, Create, Modify, or Delete Collections

collections.pl show
collections.pl show name=collection1
collections.pl create name=products instance_dir=prod_dir
collections.pl delete name=products

Get Basic Information About the Collection

info.pl show
info.pl show index_num_docs index_size free_disk_space

View or Modify Settings

settings.pl show
settings.pl show boost_recent stopword_list
settings.pl update boost_recent=false stopword_list=a stopword_list=an stopword_list=the

(Note that you can create a list by specifying the same setting key multiple times.)
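When a list is long, the repeated key=value arguments can be generated rather than typed. The sketch below uses a hypothetical shell helper named repeat_key (it is not part of the LucidWorks tools) and only prints the resulting command instead of running it.

```shell
# repeat_key KEY VALUE...
# Hypothetical helper: prints KEY=VALUE once per value, separated by spaces.
repeat_key() {
    key=$1
    shift
    out=""
    for value in "$@"; do
        # Add a separating space only when out already holds something.
        out="${out:+$out }$key=$value"
    done
    printf '%s\n' "$out"
}

args=$(repeat_key stopword_list a an the)

# Print the command that would be run.
echo "settings.pl update boost_recent=false $args"
```

To run the command for real, pass $args unquoted so the shell splits it into separate arguments; this assumes none of the values contain spaces.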

View, Create, Modify, or Delete Data Sources

ds.pl show
ds.pl show id=74
ds.pl show name=simple
ds.pl create name=simple type=file crawler=lucid.aperture path=/usr/share/gtk-doc/html
ds.pl create name=docs type=file crawler=lucid.aperture path=/usr/share/gtk-doc/html crawl_depth=100 
ds.pl update id=74 crawl_depth=999
ds.pl update name=simple crawl_depth=999
ds.pl update id=74 name=new_name crawl_depth=999
ds.pl delete id=74
ds.pl delete name=simple
ds.pl delete-all YES YES YES

Manually Start or Stop a Data Source Job

ds.pl start id=74
ds.pl start name=simple
ds.pl stop id=74
ds.pl stop name=simple

Modify the Schedule of an Existing Data Source

ds.pl schedule id=74 active=true period=60 start_time=2076-03-06T12:34:56-0800
ds.pl schedule id=74 active=true period=60 start_time=now
ds.pl schedule name=simple active=true period=60 start_time=now
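The recipes on this page use period=1800 for "every 30 minutes", which suggests that period is given in seconds. The sketch below derives a period from minutes and formats the current time in the same shape as the start_time example above. It assumes a date command that supports the %z conversion (GNU and BSD date both do), and it only prints the command rather than running it.

```shell
# Convert a human-friendly interval into the seconds used by period=.
minutes=30
period=$((minutes * 60))

# Format the current local time like 2076-03-06T12:34:56-0800.
start_time=$(date +%Y-%m-%dT%H:%M:%S%z)

# Print the command that would be run.
echo "ds.pl schedule id=74 active=true period=$period start_time=$start_time"
```

Producing a timestamp in the future needs platform-specific date options, so it is left out of this sketch.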

View the Status and Indexing History of Existing Data Sources

ds.pl status
ds.pl status id=74
ds.pl status name=simple
ds.pl history id=74
ds.pl history name=simple

View, Create, Modify, Check, or Delete Alerts

alerts.pl show
alerts.pl show username=bob
alerts.pl show id=68
alerts.pl create username=bob query=gnome name=gnome_alert
alerts.pl update id=68 period=5
alerts.pl check id=68
alerts.pl delete id=68

View, Create, Modify, or Delete Activities

activities.pl show
activities.pl show id=68
activities.pl create type=click active=true period=60 start_time=2076-03-06T12:34:56-0800
activities.pl create type=click active=true period=60 start_time=now
activities.pl update id=68 active=true period=300
activities.pl delete id=68

View the Status and History of Existing Activities

activities.pl status
activities.pl status id=68
activities.pl history id=68

View, Create, Modify, or Delete Fields

fields.pl show
fields.pl show name=mimeType
fields.pl create name=category field_type=string facet=true
fields.pl update name=category search_by_default=true
fields.pl delete name=category

View, Create, Modify, or Delete Users

users.pl show
users.pl show username=admin
users.pl create username=jim authorization=admin password=jimpass
users.pl update username=jim authorization=search
users.pl delete username=jim

Modify Roles

roles.pl show
roles.pl show name=DEFAULT
roles.pl create name=SECRET users=hank users=sam filters=status:secret
roles.pl update name=DEFAULT filters=status:public
roles.pl append name=SECRET users=jim users=bob groups=executives
roles.pl delete name=SECRET users=hank
roles.pl delete name=OLD

Pause or Resume All Background Jobs

maintenance.pl pause
maintenance.pl pause force
maintenance.pl resume ds=5 ds_sched=5 ds_sched=7 act_sched=9

Execute Searches with Optional Filters

search.pl "gtk gnome"
search.pl "gtk -gnome"
search.pl "+gtk +gnome" "mimeType:text/html"


Recipes

Indexing Data Sources

  1. Start up LucidWorks
  2. Create a data source using files on the same server as LucidWorks:
    ds.pl create name=localdocs type=file crawler=lucid.aperture path=/usr/share/gtk-doc/html crawl_depth=100
    
  3. Schedule the "localdocs" data source to be indexed every 30 minutes starting now:
    ds.pl schedule name=localdocs active=true period=1800 start_time=now
    
  4. Create a data source using a remote HTTP server:
    ds.pl create name=solrwiki type=web crawler=lucid.aperture url=http://wiki.apache.org/solr/ crawl_depth=1
    
  5. Run the "solrwiki" data source once right now:
    ds.pl start name=solrwiki
    
  6. Periodically check the status of your data sources to see when the initial indexing is done (look for "crawl_state"):
    ds.pl status
    
  7. Execute some searches in your browser: http://localhost:8989/collections/collection1/search?search%5Bq%5D=configuration
  8. Searches can also be executed via the REST API using search.pl:
    search.pl configuration
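The periodic status check in step 6 can be automated with a small polling loop. The helper below, wait_until, is hypothetical and generic: it reruns any command until its output contains a given piece of text or the attempts run out. The text to wait for is an assumption; this page does not list the possible "crawl_state" values, so check the real output of ds.pl status first.

```shell
# wait_until TEXT INTERVAL ATTEMPTS COMMAND [ARGS...]
# Hypothetical helper: reruns COMMAND until its output contains TEXT.
# Returns 0 on a match, 1 if ATTEMPTS runs are used up without one.
wait_until() {
    pattern=$1
    interval=$2
    attempts=$3
    shift 3
    while [ "$attempts" -gt 0 ]; do
        # -F treats TEXT as a fixed string rather than a regular expression.
        if "$@" | grep -qF -- "$pattern"; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "$interval"
    done
    return 1
}

# Example call (FINISHED is an assumed state name, not taken from this page):
#   wait_until FINISHED 30 40 ds.pl status name=localdocs
```

The attempt limit keeps the loop from running forever if the expected text never appears.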
    

Indexing and Activating Filters for Certain Users

  1. Start LucidWorks
  2. Create a new role, HTML_ONLY, that restricts some users and groups to searching only HTML documents:
    roles.pl create name=HTML_ONLY filters=mimeType:text/html
    
  3. Create a new search user named jim:
    users.pl create username=jim password=jimpass authorization=search
    
  4. Add "jim" to the list of users with the HTML_ONLY role:
    roles.pl append name=HTML_ONLY users=jim
    
  5. Create a data source of a directory containing HTML files as well as other plain text files:
    ds.pl create name=simple type=file crawler=lucid.aperture  path=/usr/share/gtk-doc/html crawl_depth=100
    
  6. Run the data source once right now:
    ds.pl start name=simple
    
  7. Periodically check the status of your data source to see when the initial indexing is done (look for "crawl_state"):
    ds.pl status name=simple
    
  8. Use your browser to log in as "jim" (with password "jimpass") and execute a search: http://localhost:8989/collections/collection1/search?search%5Bq%5D=
  9. As you execute various searches, you should see only HTML documents (note the "Type" facet in the right-hand navigation column).
  10. Click the "Sign Out" link in the upper-right corner of the search pages, then log in again as the "admin" user: http://localhost:8989/users/sign_out
  11. Execute the same searches as before: http://localhost:8989/collections/collection1/search?search%5Bq%5D=
  12. As you execute various searches, you should now see all documents (note the "Type" facet in the right-hand navigation column).

Indexing and Activating Click Boosting

  1. Start LucidWorks
  2. Update your settings to enable click tracking:
    settings.pl update click_enabled=true
    
  3. Create a data source:
    ds.pl create name=local_click_ds type=file crawler=lucid.aperture  path=/usr/share/gtk-doc/html crawl_depth=100
    
  4. Schedule the data source to be indexed every 30 minutes starting now:
    ds.pl schedule name=local_click_ds active=true period=1800 start_time=now
    
  5. Schedule the click processing activity to run every 10 minutes:
    activities.pl create type=click active=true period=600 start_time=now
    
  6. Periodically check the status of your data source to see when the initial indexing is done (look for "crawl_state"):
    ds.pl status name=local_click_ds
    
  7. Execute a search in your browser: http://localhost:8989/collections/collection1/search?search%5Bq%5D=gnome
  8. As you execute searches and click on results, the documents you click on should move toward the top of those searches each time the click processing activity runs (every 10 minutes).

Pause and Resume All Background Jobs for Maintenance

  1. Start LucidWorks
  2. Update your settings to enable click tracking:
    settings.pl update click_enabled=true
    
  3. Create a data source:
    ds.pl create name=local_click_ds type=file crawler=lucid.aperture path=/usr/share/gtk-doc/html crawl_depth=100
    
  4. Schedule the data source to be indexed every 30 minutes starting now:
    ds.pl schedule name=local_click_ds active=true period=1800 start_time=now
    
  5. Schedule the click processing activity to run every 10 minutes:
    activities.pl create type=click active=true period=600 start_time=now
    
  6. Pause all active data source schedules and activities, blocking until any currently running data sources and activities are finished:
    maintenance.pl pause
    

    This command should output something like the following:

    $ maintenance.pl pause
    De-Activating activity #9: http://localhost:8888/api/collections/collection1/activities/9
    De-Activating schedule of ds#5: http://localhost:8888/api/collections/collection1/datasources/5/schedule
    Waiting for any currently running Activities to finish...
    ...Done!
    Waiting for any currently running DataSources to finish...
    ...Done!
    Run this command to resume everything that was de-activated...
    maintenance.pl resume activity=9 ds=5
    
  7. Perform whatever maintenance is needed.
  8. When you are ready, run the command mentioned in the output of the "Pause" step to resume scheduled data source and activity processing:
    maintenance.pl resume activity=9 ds=5
    
  9. Your data source and click activity will now continue to run on the previously specified schedules.
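Rather than copying the resume command by hand, it can be extracted from the saved output of the pause step. The sketch below works on the sample output shown in step 6, stored in a variable; in real use you would capture the output of maintenance.pl pause instead. It assumes the resume command appears on a line of its own, as it does in that sample.

```shell
# Sample output copied from the pause step above. In real use, capture it
# with something like: pause_output=$(maintenance.pl pause)
pause_output='De-Activating activity #9: http://localhost:8888/api/collections/collection1/activities/9
De-Activating schedule of ds#5: http://localhost:8888/api/collections/collection1/datasources/5/schedule
Waiting for any currently running Activities to finish...
...Done!
Waiting for any currently running DataSources to finish...
...Done!
Run this command to resume everything that was de-activated...
maintenance.pl resume activity=9 ds=5'

# Keep only the line holding the resume command, dropping leading whitespace.
resume_cmd=$(printf '%s\n' "$pause_output" |
    sed -n 's/^[[:space:]]*\(maintenance\.pl resume.*\)$/\1/p')

echo "$resume_cmd"
```

Saving the extracted command to a file before maintenance begins means it is still available if the terminal session is lost. Review it before running it.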
