This is the documentation for LucidWorks Platform v2.0. The latest release is v2.1.

The $LWE_HOME/app/examples/python directory contains utilities that demonstrate many of the LucidWorks Enterprise REST API features from Python code. You can use these utilities to help manage a LucidWorks Enterprise installation, or as examples of how to write Python code for customer applications that interact with LucidWorks and Solr.

Dependencies

All of these tools require the "httplib2" library.

All of these tools assume that the main URL for LWE is "http://localhost:8888" and that the URL for the UI is "http://localhost:8989".

If LWE is running elsewhere, set the LWE_URL and LWE_UI_URL environment variables appropriately in the shell where you will be using these tools.

All of these tools work with "collection1" by default. To use a different collection, set the LWE_COLLECTION environment variable appropriately in the shell where you will be using these tools.
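
If you write your own script alongside these tools, it can follow the same conventions. The following is a minimal sketch, not the code the bundled tools use: the function name load_lwe_config is hypothetical, while the variable names and default values are the ones stated above.

```python
import os

# Defaults as documented above.
DEFAULTS = {
    "LWE_URL": "http://localhost:8888",
    "LWE_UI_URL": "http://localhost:8989",
    "LWE_COLLECTION": "collection1",
}


def load_lwe_config(env=None):
    """Return the three settings, preferring environment values over defaults.

    Hypothetical helper. An empty value is treated the same as an unset one,
    which is a choice of this sketch and not documented behavior.
    """
    env = os.environ if env is None else env
    return {name: env.get(name) or default for name, default in DEFAULTS.items()}


if __name__ == "__main__":
    for name, value in sorted(load_lwe_config().items()):
        print(f"{name}={value}")
```

Passing a plain dictionary instead of os.environ makes the function easy to check without changing your shell.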

Basic Usage

All of these tools can be run without any arguments to see "help" information about their usage.

Get Basic Information About the Collection

info.py show
info.py show index_num_docs index_size free_disk_space

View or Modify Settings

settings.py show
settings.py show boost_recent stopword_list
settings.py update boost_recent=false stopword_list=a stopword_list=an stopword_list=the

(Note that you can create a list by specifying the same setting key multiple times.)
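
The repeated-key convention can be illustrated with a short sketch. This is an assumption about how such arguments might be parsed, not the parser that settings.py actually uses; the function name parse_pairs is hypothetical, and whether the real tool treats a single occurrence as a one-element list is not stated here.

```python
def parse_pairs(args):
    """Turn ['a=1', 'b=x', 'b=y'] into {'a': '1', 'b': ['x', 'y']}.

    Hypothetical helper: a key given once maps to a string, and a key given
    more than once maps to a list of its values in command-line order.
    """
    parsed = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {arg!r}")
        if key not in parsed:
            parsed[key] = value
        elif isinstance(parsed[key], list):
            parsed[key].append(value)
        else:
            # Second occurrence: promote the earlier scalar to a list.
            parsed[key] = [parsed[key], value]
    return parsed


if __name__ == "__main__":
    print(parse_pairs(["boost_recent=false", "stopword_list=a",
                       "stopword_list=an", "stopword_list=the"]))
```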

View, Create, Modify, or Delete Data Sources

ds.py show
ds.py show id=74
ds.py show name=simple
ds.py create name=simple type=file crawler=lucid.aperture path=/usr/share/gtk-doc/html
ds.py create name=docs type=file crawler=lucid.aperture path=/usr/share/gtk-doc/html crawl_depth=100 
ds.py update id=74 crawl_depth=999
ds.py update name=simple crawl_depth=999
ds.py update id=74 name=new_name crawl_depth=999
ds.py delete id=74
ds.py delete name=simple

Modify the Schedule of an Existing Data Source

ds.py schedule id=74 active=true period=60 start_time=2076-03-06T12:34:56-0800
ds.py schedule id=74 active=true period=60 start_time=now
ds.py schedule name=simple active=true period=60 start_time=now
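
If you generate schedule arguments from a script, the start_time value has to match the timestamp form shown above. The sketch below is one way to produce it with the standard library. The helper names are hypothetical, and the assumption that period is expressed in seconds is an inference from the recipes on this page, which use period=1800 for "every 30 minutes".

```python
from datetime import datetime, timedelta, timezone


def format_start_time(moment):
    """Render an aware datetime in the form 2076-03-06T12:34:56-0800."""
    if moment.utcoffset() is None:
        raise ValueError("start_time needs a UTC offset; pass an aware datetime")
    return moment.strftime("%Y-%m-%dT%H:%M:%S%z")


def schedule_args(name, start, every_minutes):
    """Build the argument list for 'ds.py schedule' (hypothetical helper)."""
    # Assumption: period is in seconds (see the lead-in above).
    period = int(every_minutes * 60)
    return [
        "schedule",
        f"name={name}",
        "active=true",
        f"period={period}",
        f"start_time={format_start_time(start)}",
    ]


if __name__ == "__main__":
    offset = timezone(timedelta(hours=-8))
    start = datetime(2076, 3, 6, 12, 34, 56, tzinfo=offset)
    print("ds.py " + " ".join(schedule_args("simple", start, 1)))
```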

View the Status and Indexing History of an Existing Data Source

ds.py status
ds.py status id=74
ds.py status name=simple
ds.py history id=74
ds.py history name=simple

View, Create, Modify, or Delete Activities

activities.py show
activities.py show id=68
activities.py create type=click active=true period=60 start_time=2076-03-06T12:34:56-0800
activities.py create type=click active=true period=60 start_time=now
activities.py update id=68 period=300
activities.py delete id=68

View the Status and History of Existing Activities

activities.py status
activities.py status id=68
activities.py history id=68

View, Create, Modify, or Delete Fields

fields.py show
fields.py show name=mimeType
fields.py create name=category field_type=string facet=true
fields.py update name=category search_by_default=true
fields.py delete name=category

View, Create, Modify, or Delete Users

users.py show
users.py show username=admin
users.py create username=jim authorization=admin password=jimpass
users.py update username=jim authorization=search
users.py delete username=jim

Modify Roles

roles.py show
roles.py show name=DEFAULT
roles.py create name=SECRET users=hank users=sam filters=status:secret
roles.py update name=DEFAULT filters=status:public
roles.py append name=SECRET users=jim users=bob groups=executives
roles.py delete name=SECRET users=hank
roles.py delete name=OLD

Execute Searches with Optional Filters

search.py "gtk gnome"
search.py "gtk -gnome"
search.py "+gtk +gnome" "mimeType:text/html"

Recipes

Indexing Data Sources

  1. Start LucidWorks Enterprise
  2. Create a data source using files on the same server as LWE:
    ds.py create name=localdocs type=file crawler=lucid.aperture path=/usr/share/gtk-doc/html crawl_depth=100 
    
  3. Schedule the "localdocs" data source to be indexed every 30 minutes starting now:
    ds.py schedule name=localdocs active=true period=1800 start_time=now
    
  4. Create a data source using a remote HTTP server:
    ds.py create name=solrwiki type=web crawler=lucid.aperture url=http://wiki.apache.org/solr/ crawl_depth=1
    
  5. Schedule the 'solrwiki' data source to be indexed once right now:
    ds.py schedule name=solrwiki active=true period=0 start_time=now
    
  6. Periodically check the 'status' of your data sources to see when the initial indexing is done (look for "crawl_state"):
    ds.py status
    
  7. Execute some searches in your browser: http://localhost:8989/collections/collection1/search?search%5Bq%5D=configuration
  8. Searches can also be executed via the REST API using search.py:
    search.py configuration
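
The browser URL in step 7 contains the query parameter search[q] in percent-encoded form. As a small sketch, the same URL can be built from Python with the standard library. The function name ui_search_url is hypothetical; the base URL, collection name, and path are the ones used in this recipe.

```python
from urllib.parse import urlencode


def ui_search_url(query, ui_url="http://localhost:8989", collection="collection1"):
    """Build a search URL for the UI, following the pattern shown in step 7."""
    # urlencode percent-encodes the brackets in "search[q]" as %5B and %5D.
    params = urlencode({"search[q]": query})
    return f"{ui_url.rstrip('/')}/collections/{collection}/search?{params}"


if __name__ == "__main__":
    print(ui_search_url("configuration"))
```

Note that urlencode writes spaces in the query as "+", which is standard form encoding; this page does not show how the UI handles multi-word queries.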
    

Indexing and Activating Filters for Certain Users

  1. Start LucidWorks Enterprise
  2. Create a new role, HTML_ONLY, that restricts some users and groups to searching only HTML documents:
    roles.py create name=HTML_ONLY filters=mimeType:text/html
    
  3. Create a new search user named jim:
    users.py create username=jim password=jimpass authorization=search
    
  4. Add "jim" to the list of users with the HTML_ONLY role:
    roles.py append name=HTML_ONLY users=jim
    
  5. Create a data source of a directory containing HTML files as well as other plain text files:
    ds.py create name=simple type=file crawler=lucid.aperture path=/usr/share/gtk-doc/html crawl_depth=100
    
  6. Run the data source once right now:
    ds.py start name=simple
    
  7. Periodically check the 'status' of your data source to see when the initial indexing is done (look for "crawl_state"):
    ds.py status name=simple
    
  8. Use your browser to log in as "jim" (with password "jimpass") and execute a search: http://localhost:8989/collections/collection1/search?search%5Bq%5D=
  9. As you execute various searches you should only see HTML documents (note the "Type" facet in the right-hand navigation column).
  10. Click the "Sign Out" link in the upper-right corner of search pages and log in again as the "admin" user: http://localhost:8989/users/sign_out
  11. Execute the same searches as before: http://localhost:8989/collections/collection1/search?search%5Bq%5D=
  12. As you execute various searches you should now see all documents (note the "Type" facet in the right-hand navigation column).
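
Steps 2 through 6 of this recipe can also be chained from a single script. The sketch below only assembles and, optionally, runs the same command lines shown above. Several things are assumptions: the helper names are hypothetical, the interpreter required by the bundled scripts is not stated on this page (so it is passed in explicitly), and the use of check=True assumes the scripts exit with a non-zero status on failure.

```python
import os
import subprocess

# The commands from steps 2-6, in order.
STEPS = [
    ["roles.py", "create", "name=HTML_ONLY", "filters=mimeType:text/html"],
    ["users.py", "create", "username=jim", "password=jimpass", "authorization=search"],
    ["roles.py", "append", "name=HTML_ONLY", "users=jim"],
    ["ds.py", "create", "name=simple", "type=file", "crawler=lucid.aperture",
     "path=/usr/share/gtk-doc/html", "crawl_depth=100"],
    ["ds.py", "start", "name=simple"],
]


def build_commands(examples_dir, python="python"):
    """Return full command lines for each step (hypothetical helper)."""
    return [[python, os.path.join(examples_dir, step[0])] + step[1:] for step in STEPS]


def run_recipe(examples_dir, python="python", dry_run=True):
    """Print each command; execute it too unless dry_run is set."""
    for command in build_commands(examples_dir, python):
        print(" ".join(command))
        if not dry_run:
            # Assumption: a failing script returns a non-zero exit status.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    lwe_home = os.environ.get("LWE_HOME", ".")
    run_recipe(os.path.join(lwe_home, "app", "examples", "python"))
```

With dry_run left at its default, the script only prints the commands, which is a safe way to review them before running anything against a live installation.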

Indexing and Activating Click Boosting

  1. Start LucidWorks Enterprise
  2. Update your settings to enable click tracking:
    settings.py update click_enabled=true
    
  3. Create a data source:
    ds.py create name=local_click_ds type=file crawler=lucid.aperture path=/usr/share/gtk-doc/html crawl_depth=100 
    
  4. Schedule the data source to be indexed every 30 minutes starting now:
    ds.py schedule name=local_click_ds active=true period=1800 start_time=now
    
  5. Schedule the click processing activity to run every 10 minutes:
    activities.py create type=click active=true period=600 start_time=now
    
  6. Periodically check the status of your data source to see when the initial indexing is done (look for "crawl_state"):
    ds.py status name=local_click_ds
    
  7. Execute a search in your browser: http://localhost:8989/collections/collection1/search?search%5Bq%5D=gnome
  8. As you execute searches and click on results, you should see the documents you click on rise toward the top of those searches as the click processing activity runs every 10 minutes.
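
The "periodically check the status" step in this recipe, like the status checks in the other recipes, can be automated with a small polling loop. The following is a generic sketch, not part of the bundled tools. The function name wait_until is hypothetical, and because this page does not list the possible values of "crawl_state", the test for "indexing is done" is left to a callable that you supply.

```python
import time


def wait_until(check, interval=30.0, timeout=3600.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() every `interval` seconds until it returns a true value.

    Returns True as soon as check() succeeds, or False once `timeout`
    seconds have passed without success. `sleep` and `clock` are
    injectable so the loop can be exercised without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Stand-in check that succeeds on the third call. A real check might run
    # "ds.py status name=local_click_ds" and inspect the reported crawl_state.
    answers = iter([False, False, True])
    print(wait_until(lambda: next(answers), interval=0.1, timeout=5.0))
```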
