LucidWorks Enterprise v1.5

This is the documentation for LucidWorks Enterprise v1.5. The latest version is 2.1

The examples/perl directory contains utilities that demonstrate many of the LucidWorks REST API features from Perl code. Use them to help manage a LucidWorks installation, or as examples of how to write Perl code for applications that interact with LucidWorks and Solr.

Dependencies

All of these tools require that the "JSON" Perl module be installed.

All of these tools assume that the main URL for LWE is "http://localhost:8888". If LWE is running elsewhere, set the LWE_URL environment variable appropriately in the shell where you will be using these tools.
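
As a sketch of how a shell session might be prepared, the snippet below keeps any LWE_URL that is already set and otherwise falls back to the documented default. The helper lwe_api_url is hypothetical (it is not one of the shipped tools). The /api/collections/<collection>/... path it builds is inferred from the sample maintenance.pl output later on this page, and other endpoints may not follow the same pattern.

```shell
# Keep an existing LWE_URL; otherwise use the documented default.
LWE_URL="${LWE_URL:-http://localhost:8888}"
export LWE_URL

# Hypothetical helper: join path segments under the collection API root.
# The URL layout is an assumption based on the maintenance.pl sample output.
lwe_api_url() {
    collection=$1
    shift
    # Strip one trailing slash so the result never contains "//api".
    url="${LWE_URL%/}/api/collections/${collection}"
    for segment in "$@"; do
        url="${url}/${segment}"
    done
    printf '%s\n' "$url"
}
```

For example, `lwe_api_url collection1 datasources 5 schedule` prints the schedule URL for data source 5 under whatever LWE_URL is in effect.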

Basic Usage

Run any of these tools without arguments to see "help" info about its usage.

Get some basic info about the running instance of LWE...

collection.pl show
collection.pl show index_num_docs index_size free_disk_space

View, Modify Settings...

settings.pl show
settings.pl show boost_recent stopword_list
settings.pl update boost_recent=false stopword_list=a stopword_list=an stopword_list=the

(Note that a list is created by specifying the same setting key multiple times.)
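
When a list is long, the repeated key=value arguments can be generated rather than typed. A minimal sketch, assuming a hypothetical helper named repeat_key (not part of the shipped tools) and list values that contain no whitespace:

```shell
# Hypothetical helper: print "key=v1 key=v2 ..." for the given values.
# Usage: repeat_key KEY VALUE...
repeat_key() {
    key=$1
    shift
    out=""
    for value in "$@"; do
        # Add a separating space only when out already has content.
        out="${out:+$out }${key}=${value}"
    done
    printf '%s\n' "$out"
}
```

It could then be used unquoted so the shell splits the result into separate arguments, for example: settings.pl update boost_recent=false $(repeat_key stopword_list a an the)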

View, Create, Modify, Delete Data Sources...

ds.pl show
ds.pl show id=74
ds.pl show name=simple
ds.pl create name=simple type=FileSystemDataSource path=/usr/share/gtk-doc/html
ds.pl create name=docs type=FileSystemDataSource path=/usr/share/gtk-doc/html crawl_depth=100 include_subdirectories=true
ds.pl update id=74 crawl_depth=999
ds.pl update name=simple crawl_depth=999
ds.pl update id=74 name=new_name crawl_depth=999
ds.pl delete id=74  
ds.pl delete name=simple
ds.pl delete-all YES YES YES

Modify the schedule of an existing Data Source...

ds.pl schedule id=74 active=true period=60 start_time=2076-03-06T12:34:56-0800
ds.pl schedule id=74 active=true period=60 start_time=now 
ds.pl schedule name=simple active=true period=60 start_time=now 
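
The period values above are not labeled with a unit. The recipes later on this page pair "every 30 minutes" with period=1800 and "every 10 minutes" with period=600, which suggests seconds. The sketch below works under that assumption. The helper names minutes_to_period and start_time_stamp are hypothetical, and this page does not confirm that the tools accept every timestamp that date produces.

```shell
# Convert minutes to a period value, assuming periods are given in seconds.
minutes_to_period() {
    echo $(( $1 * 60 ))
}

# Print the current local time in the same layout as the
# 2076-03-06T12:34:56-0800 example: date, time, numeric UTC offset.
start_time_stamp() {
    date '+%Y-%m-%dT%H:%M:%S%z'
}
```

A schedule command could then read: ds.pl schedule id=74 active=true period=$(minutes_to_period 30) start_time=now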

View the status and indexing history of existing Data Sources...

ds.pl status
ds.pl status id=74
ds.pl status name=simple
ds.pl history id=74
ds.pl history name=simple

View, Create, Modify, Check, Delete Alerts...

alerts.pl show username=bob
alerts.pl show username=bob id=68
alerts.pl create username=bob query=gnome name=gnome_alert
alerts.pl update username=bob id=68 update_interval=5
alerts.pl check username=bob id=68
alerts.pl delete username=bob id=68

View, Create, Modify, Delete Activities...

activities.pl show
activities.pl show id=68
activities.pl create type=click active=true period=60 start_time=2076-03-06T12:34:56-0800 
activities.pl create type=click active=true period=60 start_time=now
activities.pl update id=68 active=true period=300
activities.pl delete id=68

View the status and history of existing Activities...

activities.pl status
activities.pl status id=68
activities.pl history id=68

View, Create, Modify, Delete Fields...

fields.pl show
fields.pl show name=mimeType
fields.pl create name=category field_type=string facet=true
fields.pl update name=category search_by_default=true
fields.pl delete name=category

View, Create, Modify, Delete Users...

users.pl show
users.pl show username=admin
users.pl create username=jim first_name=Jim last_name=Bo email=jim@bo.com password=jpass
users.pl update username=jim first_name=James
users.pl delete username=jim

Modify Roles...

roles.pl show
roles.pl show name=ROLE_SEARCH
roles.pl create name=ROLE_SECRET users=hank users=sam filters=status:secret
roles.pl update name=ROLE_SEARCH filters=status:public
roles.pl append name=ROLE_ADMIN users=jim users=sam groups=executives
roles.pl delete name=ROLE_SECRET users=hank
roles.pl delete name=ROLE_OLD

Pause, Resume All Background Jobs...

maintenance.pl pause
maintenance.pl resume ds=5 ds=7 activity=9

Execute Searches (with optional filters)

search.pl "gtk gnome"
search.pl "gtk -gnome"
search.pl "+gtk +gnome" "mimeType:text/html" 

Recipes

Indexing Some Data Sources

  1. Start up LWE
  2. Create a datasource using files on the same server as LWE
    ds.pl create name=localdocs type=FileSystemDataSource path=/usr/share/gtk-doc/html crawl_depth=100 include_subdirectories=true
    
  3. Schedule the 'localdocs' datasource to be indexed every 30 minutes starting now
    ds.pl schedule name=localdocs active=true period=1800 start_time=now
    
  4. Create a datasource using a remote HTTP server
    ds.pl create name=solrwiki type=WebDataSource url=http://wiki.apache.org/solr/ crawl_depth=1
    
  5. Schedule the 'solrwiki' datasource to be indexed once right now
    ds.pl schedule name=solrwiki active=true period=0 start_time=now
    
  6. Periodically check the 'status' of your datasources to see when the initial indexing is done (look for: "running" : false)
    ds.pl status
    
  7. Execute some searches in your browser, e.g.: http://localhost:8989/search?q=configuration
  8. Searches can also be executed via the REST API using search.pl, e.g.:
    search.pl configuration
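
The "periodically check the status" step can be automated with a small polling loop. This is a sketch only: wait_for_ds and the DS_CMD variable are hypothetical names, and the loop assumes the status output contains the literal text "running" : true or "running" : false, which is all this page shows of the output format.

```shell
# Command used to query status; override DS_CMD to point elsewhere.
DS_CMD="${DS_CMD:-ds.pl}"

# Hypothetical helper: block until the named data source is no longer running.
# Usage: wait_for_ds NAME [SECONDS_BETWEEN_POLLS]
wait_for_ds() {
    name=$1
    interval=${2:-30}
    while :; do
        # Give up if the status command itself fails.
        status=$($DS_CMD status "name=$name") || return 1
        if printf '%s\n' "$status" | grep -Eq '"running"[[:space:]]*:[[:space:]]*true'; then
            sleep "$interval"
        elif printf '%s\n' "$status" | grep -Eq '"running"[[:space:]]*:[[:space:]]*false'; then
            return 0
        else
            echo "no \"running\" field found in status output" >&2
            return 1
        fi
    done
}
```

With this in place, something like `wait_for_ds localdocs && search.pl configuration` would run the search only after the initial indexing has finished.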
    

Indexing and Activating "Filters" for Certain Users

  1. Start up LWE
  2. Modify ROLE_SEARCH so that, by default, users who can load the search UI can only search for HTML files
    roles.pl update name=ROLE_SEARCH filters=mimeType:text/html
    
  3. Create a new user named jim
    users.pl create username=jim first_name=Jim last_name=Bo email=jim@bo.com password=jimpass
    
  4. Give user jim access to the search UI
    roles.pl append name=ROLE_SEARCH users=jim
    
  5. Give user jim special access to all docs via a new role
    roles.pl create name=ROLE_SEE_ALL users=jim filters=*:*
    
  6. Create a datasource of a directory containing HTML files as well as other plain text files
    ds.pl create name=simple type=FileSystemDataSource path=/usr/share/gtk-doc/html crawl_depth=100 include_subdirectories=true
    
  7. Schedule the datasource to be indexed once right now
    ds.pl schedule name=simple active=true period=0 start_time=now
    
  8. Periodically check the 'status' of your datasource to see when the initial indexing is done (look for: "running" : false)
    ds.pl status name=simple
    
  9. Execute a search in your browser, e.g.: http://localhost:8989/search?q=*:*
  10. As you execute various searches you should only see HTML documents (note the "Type" facet in the right-hand navigation column)
  11. Click the "Login" link in the upper right corner of search pages to go to the Login page: http://localhost:8989/login
  12. Log in as user "jim" with password "jimpass"
  13. Execute the same searches as before, e.g.: http://localhost:8989/search?q=*:*
  14. As you execute various searches you should now see all documents (note the "Type" facet in the right-hand navigation column)

Indexing and Activating "Click Boosting"

  1. Start up LWE
  2. Update your settings to enable click tracking
    settings.pl update click_enabled=true
    
  3. Create a datasource
    ds.pl create name=local_click_ds type=FileSystemDataSource path=/usr/share/gtk-doc/html crawl_depth=100 include_subdirectories=true
    
  4. Schedule the datasource to be indexed every 30 minutes starting now
    ds.pl schedule name=local_click_ds active=true period=1800 start_time=now
    
  5. Schedule the click processing activity to run every 10 minutes
    activities.pl create type=click active=true period=600 start_time=now
    
  6. Periodically check the 'status' of your datasource to see when the initial indexing is done (look for: "running" : false)
    ds.pl status name=local_click_ds
    
  7. Execute a search in your browser, e.g.: http://localhost:8989/search?q=gnome
  8. As you execute searches and click on results, you should see the documents you click on rise toward the top of those searches as the click processing activity runs every 10 minutes.

Pause and Resume All Background Jobs for Maintenance

  1. Start up LWE
  2. Update your settings to enable click tracking
    settings.pl update click_enabled=true
    
  3. Create a datasource
    ds.pl create name=local_click_ds type=FileSystemDataSource path=/usr/share/gtk-doc/html crawl_depth=100 include_subdirectories=true
    
  4. Schedule the datasource to be indexed every 30 minutes starting now
    ds.pl schedule name=local_click_ds active=true period=1800 start_time=now
    
  5. Schedule the click processing activity to run every 10 minutes
    activities.pl create type=click active=true period=600 start_time=now
    
  6. Pause all active datasource schedules and activities, blocking until any currently running datasources and activities are finished
    maintenance.pl pause
    

    This command should output something like the following...

    $ maintenance.pl pause
    De-Activating activity #9: http://localhost:8888/api/collections/collection1/activities/9
    De-Activating schedule of ds#5: http://localhost:8888/api/collections/collection1/datasources/5/schedule
    Waiting for any currently running Activities to finish...
    ...Done!
    Waiting for any currently running DataSources to finish...
    ...Done!
    Run this command to resume everything that was de-activated...
    maintenance.pl resume activity=9 ds=5
    
  7. Perform whatever maintenance is needed
  8. When you are ready, run the command mentioned in the output of the "Pause" step to resume scheduled datasource and activity processing
    maintenance.pl resume activity=9 ds=5
    
  9. Your datasource and click activity will now continue to run on the previously specified schedules.
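
Because the resume command is only printed at the end of the pause output, a maintenance script may want to capture it instead of relying on copy and paste. A minimal sketch, assuming the output keeps the shape of the sample above (a line beginning with "maintenance.pl resume"); extract_resume_cmd is a hypothetical helper, not one of the shipped tools:

```shell
# Hypothetical helper: read pause output on stdin and print the last
# line that starts with "maintenance.pl resume", without leading whitespace.
extract_resume_cmd() {
    sed -n 's/^[[:space:]]*\(maintenance\.pl resume .*\)$/\1/p' | tail -n 1
}
```

One possible use is to save the pause output to a file, perform the maintenance, and then run the command printed by `extract_resume_cmd < pause.log` (pause.log being an illustrative file name).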