The $LWE_HOME/app/examples/python directory contains utilities that demonstrate many of the LucidWorks Enterprise REST API features from Python code. These utilities can help you manage a LucidWorks Enterprise installation, or serve as examples of how to write Python code in customer applications that interact with LucidWorks and Solr.
- Dependencies
- Basic Usage
- Get Basic Information About the Collection
- View or Modify Settings
- View, Create, Modify, or Delete Data Sources
- Modify the Schedule of an Existing Data Source
- View the Status and Indexing History of an Existing Data Source
- View, Create, Modify, or Delete Activities
- View the Status and History of Existing Activities
- View, Create, Modify, or Delete Fields
- View, Create, Modify, or Delete Users
- Modify Roles
- Execute Searches with Optional Filters
- Recipes
Dependencies
All of these tools require that the "httplib2" library be available.
All of these tools assume that the main URL for LWE is "http://localhost:8888" and that the URL for the UI is "http://localhost:8989". If LWE is running elsewhere, set the LWE_URL and LWE_UI_URL environment variables appropriately in the shell where you will be using these tools.
All of these tools work with "collection1" by default. To use a different collection, set the LWE_COLLECTION environment variable appropriately in the shell where you will be using these tools.
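As an illustration of how a script might honor these variables, the following is a minimal sketch rather than the code the bundled tools actually use. The function name lwe_config and the DEFAULTS table are hypothetical; only the variable names and default values come from the text above.

```python
import os

# Defaults as documented above. The names DEFAULTS and lwe_config are
# illustrative only and are not taken from the bundled tools.
DEFAULTS = {
    "LWE_URL": "http://localhost:8888",
    "LWE_UI_URL": "http://localhost:8989",
    "LWE_COLLECTION": "collection1",
}


def lwe_config(env=None):
    """Return the three settings, preferring environment values over defaults."""
    env = os.environ if env is None else env
    return {name: env.get(name, default) for name, default in DEFAULTS.items()}


if __name__ == "__main__":
    # Print the effective configuration for the current shell.
    for name, value in lwe_config().items():
        print(f"{name}={value}")
```

Passing a plain dictionary instead of os.environ makes the lookup easy to exercise without changing the shell environment.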
Basic Usage
All of these tools can be run without any arguments to see "help" information about their usage.
Get Basic Information About the Collection
info.py show
info.py show index_num_docs index_size free_disk_space
View or Modify Settings
settings.py show
settings.py show boost_recent stopword_list
settings.py update boost_recent=false stopword_list=a stopword_list=an stopword_list=the
(Note that you can create a list by specifying the same setting key multiple times.)
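The repeated-key convention can be sketched as follows. This is an assumption about how such arguments could be parsed, not the actual implementation in settings.py, and the function name parse_key_values is hypothetical. Values are kept as strings; any conversion (for example "false" to a boolean) is left out.

```python
def parse_key_values(args):
    """Turn ["k=v", ...] into a dict; a key given more than once becomes a list."""
    parsed = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {arg!r}")
        if key not in parsed:
            # First occurrence: store the plain string.
            parsed[key] = value
        elif isinstance(parsed[key], list):
            # Third or later occurrence: extend the existing list.
            parsed[key].append(value)
        else:
            # Second occurrence: promote the string to a list.
            parsed[key] = [parsed[key], value]
    return parsed


if __name__ == "__main__":
    print(parse_key_values(
        ["boost_recent=false", "stopword_list=a", "stopword_list=an", "stopword_list=the"]
    ))
```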
View, Create, Modify, or Delete Data Sources
ds.py show
ds.py show id=74
ds.py show name=simple
ds.py create name=simple type=file crawler=lucid.aperture path=/usr/share/gtk-doc/html
ds.py create name=docs type=file crawler=lucid.aperture path=/usr/share/gtk-doc/html crawl_depth=100
ds.py update id=74 crawl_depth=999
ds.py update name=simple crawl_depth=999
ds.py update id=74 name=new_name crawl_depth=999
ds.py delete id=74
ds.py delete name=simple
Modify the Schedule of an Existing Data Source
ds.py schedule id=74 active=true period=60 start_time=2076-03-06T12:34:56-0800
ds.py schedule id=74 active=true period=60 start_time=now
ds.py schedule name=simple active=true period=60 start_time=now
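The schedule values can also be generated from Python. The sketch below assumes that period is expressed in seconds, which is inferred from the recipes later in this document (they pair "every 30 minutes" with period=1800), and that start_time accepts the timestamp layout shown in the example above. Both helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def period_from_minutes(minutes):
    """Convert minutes to a period value, assuming the period is in seconds."""
    return int(minutes * 60)


def format_start_time(moment):
    """Format a timezone-aware datetime like 2076-03-06T12:34:56-0800."""
    if moment.utcoffset() is None:
        raise ValueError("start time needs a UTC offset")
    # %z renders the offset as +HHMM or -HHMM, matching the example above.
    return moment.strftime("%Y-%m-%dT%H:%M:%S%z")


if __name__ == "__main__":
    pacific = timezone(timedelta(hours=-8))
    start = format_start_time(datetime(2076, 3, 6, 12, 34, 56, tzinfo=pacific))
    print(f"ds.py schedule id=74 active=true period={period_from_minutes(1)} start_time={start}")
```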
View the Status and Indexing History of an Existing Data Source
ds.py status
ds.py status id=74
ds.py status name=simple
ds.py history id=74
ds.py history name=simple
View, Create, Modify, or Delete Activities
activities.py show
activities.py show id=68
activities.py create type=click active=true period=60 start_time=2076-03-06T12:34:56-0800
activities.py create type=click active=true period=60 start_time=now
activities.py update id=68 period=300
activities.py delete id=68
View the Status and History of Existing Activities
activities.py status
activities.py status id=68
activities.py history id=68
View, Create, Modify, or Delete Fields
fields.py show
fields.py show name=mimeType
fields.py create name=category field_type=string facet=true
fields.py update name=category search_by_default=true
fields.py delete name=category
View, Create, Modify, or Delete Users
users.py show
users.py show username=admin
users.py create username=jim authorization=admin password=jimpass
users.py update username=jim authorization=search
users.py delete username=jim
Modify Roles
roles.py show
roles.py show name=DEFAULT
roles.py create name=SECRET users=hank users=sam filters=status:secret
roles.py update name=DEFAULT filters=status:public
roles.py append name=SECRET users=jim users=bob groups=executives
roles.py delete name=SECRET users=hank
roles.py delete name=OLD
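When driving these scripts from another Python program, the argument lists can be assembled programmatically. The helper below is a hypothetical sketch, not part of the bundled tools: it expands list values into repeated key=value arguments, matching the way users= is repeated in the examples above.

```python
def build_args(action, **params):
    """Build an argument list such as ["create", "name=SECRET", "users=hank", ...].

    A list or tuple value is expanded into one key=value argument per item.
    """
    args = [action]
    for key, value in params.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        args.extend(f"{key}={item}" for item in values)
    return args


if __name__ == "__main__":
    print(build_args("create", name="SECRET", users=["hank", "sam"], filters="status:secret"))
```

The resulting list could be appended to the script path and handed to subprocess.run, which avoids shell quoting issues for values that contain spaces or colons.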
Execute Searches with Optional Filters
search.py "gtk gnome"
search.py "gtk -gnome"
search.py "+gtk +gnome" "mimeType:text/html"
Recipes
Indexing Data Sources
- Start LucidWorks Enterprise
- Create a data source using files on the same server as LWE:
ds.py create name=localdocs type=file crawler=lucid.aperture path=/usr/share/gtk-doc/html crawl_depth=100
- Schedule the "localdocs" data source to be indexed every 30 minutes starting now:
ds.py schedule name=localdocs active=true period=1800 start_time=now
- Create a data source using a remote HTTP server:
ds.py create name=solrwiki type=web crawler=lucid.aperture url=http://wiki.apache.org/solr/ crawl_depth=1
- Schedule the "solrwiki" data source to be indexed once right now:
ds.py schedule name=solrwiki active=true period=0 start_time=now
- Periodically check the status of your data sources to see when the initial indexing is done (look for "crawl_state"):
ds.py status
- Execute some searches in your browser: http://localhost:8989/collections/collection1/search?search%5Bq%5D=configuration
- Searches can also be executed via the REST API using search.py:
search.py configuration
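The browser search URL used in this recipe contains a percent-encoded parameter name: search%5Bq%5D is the encoded form of search[q]. A small sketch for building such URLs for other queries follows. The function name ui_search_url is hypothetical, and the defaults are the UI URL and collection documented under Dependencies.

```python
from urllib.parse import urlencode


def ui_search_url(query, ui_url="http://localhost:8989", collection="collection1"):
    """Build a search UI URL, percent-encoding the search[q] parameter."""
    base = ui_url.rstrip("/")
    return f"{base}/collections/{collection}/search?" + urlencode({"search[q]": query})


if __name__ == "__main__":
    print(ui_search_url("configuration"))
```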
Indexing and Activating Filters for Certain Users
- Start LucidWorks Enterprise
- Create a new role HTML_ONLY to restrict some users and groups to searching only HTML documents:
roles.py create name=HTML_ONLY filters=mimeType:text/html
- Create a new search user named jim:
users.py create username=jim password=jimpass authorization=search
- Add "jim" to the list of users with the HTML_ONLY role:
roles.py append name=HTML_ONLY users=jim
- Create a data source of a directory containing HTML files as well as other plain text files:
ds.py create name=simple type=file crawler=lucid.aperture path=/usr/share/gtk-doc/html crawl_depth=100
- Run the data source once right now:
ds.py start name=simple
- Periodically check the status of your data source to see when the initial indexing is done (look for "crawl_state"):
ds.py status name=simple
- Use your browser to log in as "jim" (with password "jimpass") and execute a search: http://localhost:8989/collections/collection1/search?search%5Bq%5D=
- As you execute various searches you should only see HTML documents (note the "Type" facet in the right-hand navigation column)
- Click the "Sign Out" link in the upper-right corner of the search pages and log in again as the "admin" user: http://localhost:8989/users/sign_out
- Execute the same searches as before: http://localhost:8989/collections/collection1/search?search%5Bq%5D=
- As you execute various searches you should now see all documents (note the "Type" facet in the right-hand navigation column)
Indexing and Activating Click Boosting
- Start LucidWorks Enterprise
- Update your settings to enable click tracking:
settings.py update click_enabled=true
- Create a data source:
ds.py create name=local_click_ds type=file crawler=lucid.aperture path=/usr/share/gtk-doc/html crawl_depth=100
- Schedule the data source to be indexed every 30 minutes starting now:
ds.py schedule name=local_click_ds active=true period=1800 start_time=now
- Schedule the click processing activity to run every 10 minutes:
activities.py create type=click active=true period=600 start_time=now
- Periodically check the status of your data source to see when the initial indexing is done (look for "crawl_state"):
ds.py status name=local_click_ds
- Execute a search in your browser: http://localhost:8989/collections/collection1/search?search%5Bq%5D=gnome
- As you execute searches and click on results, you should see the documents you click on rise toward the top of those searches each time the click processing activity runs (every 10 minutes).
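Each recipe above includes a "periodically check the status" step. That step could be automated with a generic polling loop like the sketch below. The helper wait_until is hypothetical. How the check function reads "crawl_state" is deliberately left to the caller, because the output format of ds.py status and the possible state values are not described here.

```python
import time


def wait_until(check, interval=30.0, timeout=3600.0, sleep=time.sleep, clock=time.monotonic):
    """Call check() every `interval` seconds until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds have passed.
    The sleep and clock parameters exist so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        result = check()
        if result:
            return result
        if clock() >= deadline:
            raise TimeoutError("condition not met before timeout")
        sleep(interval)


if __name__ == "__main__":
    # Stand-in check that succeeds on the third call; a real check would
    # inspect the data source status instead.
    answers = iter([False, False, True])
    print(wait_until(lambda: next(answers), interval=0.1))
```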