The Settings API lets you read and modify the settings of a given collection. Note that some of the settings listed below cannot be changed by customers with LucidWorks Search hosted on AWS or Azure.
API Entry Points
/api/collections/collection/settings: get all settings for a collection or update settings.
/api/collections/collection/settings/name: get a particular setting.
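As a rough illustration of how the two entry points are composed, the following Python sketch builds both URLs. The helper name `settings_url` is hypothetical, and the base URL `http://localhost:8888` is simply borrowed from the curl examples in this section; neither is part of the API itself.

```python
from typing import Optional
from urllib.parse import quote

# Assumed base URL, taken from the curl examples in this section.
BASE_URL = "http://localhost:8888"


def settings_url(collection: str, name: Optional[str] = None,
                 base_url: str = BASE_URL) -> str:
    """Build the Settings API URL for a collection (hypothetical helper)."""
    # Percent-encode path segments so unusual collection names stay valid.
    url = f"{base_url}/api/collections/{quote(collection, safe='')}/settings"
    if name is not None:
        # Appending a setting name targets the single-setting entry point.
        url += f"/{quote(name, safe='')}"
    return url


print(settings_url("collection1"))
print(settings_url("collection1", "query_parser"))
```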
Get All Settings for a Collection
GET /api/collections/collection/settings
Input
Path Parameters
| Key | Description |
|---|---|
| collection | The collection name. |
Query Parameters
None.
Output
Output Content
| Key | Type | Description |
|---|---|---|
| auto_complete | boolean | Is true if auto-complete is enabled for use in the LucidWorks Search default search interface. Note that this also requires setting the auto-complete activity to run at regular intervals. For more information, see Auto-Complete of User Queries. |
| boosts | Solr function query | Defines the boost to apply to each query. The default boost for the Lucid Query Parser prefers more recent documents. |
| boost_recent | boolean | Is true if the lucid request handler should boost recent documents. |
| click_enabled | boolean | Is true if Click Scoring is enabled. If enabling this feature with NearRealTime (NRT) search (the update_handler_autosoftcommit_* parameters discussed below), please refer to the Click Scoring Relevance Framework section for more information about how Click Scoring and NRT impact document updates. This feature is available in LucidWorks Search on-premise only. |
| click_boost_data | string | The path to Click Scoring boost data. This feature is available in LucidWorks Search on-premise only. |
| click_boost_field | string | The field name prefix used by Click fields. This feature is available in LucidWorks Search on-premise only. |
| click_index_location | string | The path to Click boost index (LucidWorks Search on-premise only). |
| de_duplication | string | In LucidWorks Search, duplicates can be identified by calculating a hash that identifies very similar documents. While this setting enables de-duplication generally, specific fields should be selected as being used for de-duplication, which can be done with the Fields API or the Field Configuration screen. If no fields are selected as the basis for determining duplicate documents, then all fields of a document are used. You can choose from three possible methods of handling duplicates: off (do not de-duplicate), overwrite (overwrite duplicate documents), or tag (tag duplicates with a unique signature). |
| default_sort | string | Default sort method - valid values are: relevance, date, random. |
| display_facets | boolean | Is true if the LucidWorks Search default search interface should display facets. |
| display_fields | string | Defines the fields to use for display of results to users. Primarily used to add pseudo-fields to documents, but could be used with "real" fields also. This parameter only applies when using the lucid handler (query parser). |
| elevations | JSON map | Defines the documents to be elevated or excluded from results for a specific query. It uses Solr's QueryElevationComponent, which is enabled by default in LucidWorks Search. This API is an interface to manage the elevate.xml file, which stores the elevation definitions that are used for queries. The elevations file is located in the conf directory for each collection ($LWE_HOME/conf/solr/cores/collection/conf). The elevate.xml file is an XML file defining the query and the IDs of the documents that are to be elevated or excluded. The API uses a JSON map to write to this file with a structure of: {"elevations": {"query": [{"doc": "docID", "exclude": false}]}}. It is also possible to define elevations or exclusions using the built-in Search UI, which includes a "pin" or a "minus" next to every result to allow you to add it to the elevations list as either a required document or an excluded document. |
| main_index_lock_type | string | Defines which Lucene LockFactory to use. When applying changes to an index, the IndexWriter requires a lock on the directory. The options are: |
| main_index_max_buffered_docs | integer | Allows setting the maxBufferedDocs parameter in the solrconfig.xml file for the collection, which sets the number of document updates to buffer in memory before they are flushed to disk and added to the current index segment. It is generally preferable to use main_index_ram_buffer_size_mb, but if both settings are defined, a flush will occur when either limit is reached. |
| main_index_max_merge_docs | integer | Allows setting the maxMergeDocs parameter in the solrconfig.xml file for the collection, which sets the maximum number of documents for a single segment. Once this limit is reached, the segment is closed and a new one is created. A segment merge, as defined by main_index_merge_factor, may also occur at this time. |
| main_index_merge_factor | integer | Allows setting the mergeFactor parameter in the solrconfig.xml file for the collection, which defines how many segments the index is allowed to have before they are coalesced into one segment. When the index is updated, the new data is added to the most recently opened segment. When that segment is full, a new segment is created and subsequent updates are placed there (when a segment counts as full is defined by the main_index_max_buffered_docs and main_index_ram_buffer_size_mb settings). When the main_index_merge_factor is reached, the segments are merged into a single larger segment. See the section on mergeFactor in the Solr Reference Guide for more information. |
| main_index_ram_buffer_size_mb | integer | Allows setting the ramBufferSizeMB parameter in the solrconfig.xml file for the collection, which sets the amount of memory space (in megabytes) document updates can use before they are flushed to the current index segment. This setting is generally preferable to main_index_max_buffered_docs, but if both settings are defined, a flush will occur when either limit is reached. |
| main_index_term_index_interval | integer | Allows setting the termIndexInterval for the index, which determines the amount of computation required per query term, regardless of the number of documents. This allows some level of control over the time query processing takes. Larger values cause less memory to be used by the IndexReader, but slow random access to terms. Smaller values cause more memory to be used by the IndexReader, but speed up random access to terms. A large index with user-entered queries may benefit from a larger main_index_term_index_interval because query processing is dominated by frequency and positional data processing and not by term lookup. A system that handles a great deal of wildcard queries may benefit from a smaller value for this setting. |
| main_index_use_compound_file | boolean | Allows you to set the useCompoundFile parameter in the solrconfig.xml file for the collection. Setting this to true combines the multiple index files on disk into a single file. This helps avoid hitting the open file limit on systems that restrict the number of open files allowed per process. See the section on useCompoundFile in the Solr Reference Guide for more information. |
| main_index_write_lock_timeout | integer | Defines the maximum time to wait for a write lock. |
| query_parser | string | Which query parser the lucid search request handler will use - valid values are: lucid, dismax, extended dismax, lucene. |
| query_time_stopwords | boolean | Is true if stopwords will be removed at query time. |
| query_time_synonyms | boolean | Is true if synonyms should be added to queries. This will only be used if the 'lucid' query parser is selected as the default or used in the query request. |
| search_server_list | list:string | A list of Solr core URLs that the lucid request handler will use for distributed search - pass an empty list to disable distributed search. |
| show_similar | boolean | Is true if a "Find Similar" link should be displayed next to user's search results. |
| spellcheck | boolean | Is true if the LucidWorks Search default search interface should suggest spelling corrections. |
| stopword_list | list:string | A list of stopwords that will be used if 'query_time_stopwords' is enabled. |
| synonym_list | list:string | A list of synonym rules that will be used if 'query_time_synonyms' is enabled. |
| unknown_type_handling | string | A valid field type from the core's schema to use for unrecognized fields - default is text_en. |
| unsupervised_feedback | boolean | Is true if unsupervised feedback is enabled. |
| unsupervised_feedback_emphasis | string | Defines whether unsupervised feedback should emphasize "relevancy", which does an "AND" of the original query and so neither includes nor excludes additional documents, or "recall", which does an "OR" of the original query and so permits the feedback terms to expand the set of documents matched. The default is "relevancy". |
| update_handler_autocommit_max_docs | integer | Allows setting the maxDocs parameter for autocommit definitions in the solrconfig.xml file for the collection. This setting defines the number of documents to queue before pushing them to the index. It works in conjunction with the update_handler_autocommit_max_time parameter in that if either limit is reached, the pending updates will be pushed to the index. |
| update_handler_autocommit_max_time | integer | Allows setting the maxTime parameter for autocommit definitions in the solrconfig.xml file for the collection. This setting defines the number of milliseconds to wait before pushing documents to the index. It works in conjunction with the update_handler_autocommit_max_docs parameter in that if either limit is reached, the pending updates will be pushed to the index. |
| update_handler_autocommit_open_searcher | boolean | Provides the option to not open a searcher on hard commit. This may be useful to minimize the size of transaction logs that keep track of uncommitted updates. The default is true; change this to false to not open a searcher. |
| update_handler_autosoftcommit_max_docs | integer | Allows setting the maxDocs parameter for autosoftcommit definitions in the solrconfig.xml file for the collection. "Soft" commits are used in Solr's Near RealTime search. This setting defines the number of documents to queue before pushing them to the index. It works in conjunction with the update_handler_autosoftcommit_max_time parameter in that if either limit is reached, the documents will be pushed to the index. |
| update_handler_autosoftcommit_max_time | integer | Allows setting the maxTime parameter for autosoftcommit definitions in the solrconfig.xml file for the collection. "Soft" commits are used in Solr's Near RealTime search. This setting defines the number of milliseconds to wait before pushing documents to the index. It works in conjunction with the update_handler_autosoftcommit_max_docs parameter in that if either limit is reached, the documents will be pushed to the index. |
| update_server_list | complex | A map that contains two keys: 'server_list' and 'self'. 'server_list' is list:string of servers that the lucid update chain will use for distributed updates and 'self' should either be null if this server will not receive updates, or it should be a string value containing this server address if this server will receive updates - pass an empty list of servers to disable distributed update. |
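The "either limit" rule that the table states for the autocommit settings can be pictured with a small Python sketch. This is only a model of the rule as described, not Solr's implementation. The function names are hypothetical, and treating null (None) or negative values as "limit not set" is an assumption.

```python
from typing import Optional


def limit_reached(current: float, limit: Optional[float]) -> bool:
    """Return True when a configured limit has been hit.

    Assumption: None or a negative value means the limit is not set.
    """
    return limit is not None and limit >= 0 and current >= limit


def should_commit(pending_docs: int, elapsed_ms: int,
                  max_docs: Optional[int], max_time: Optional[int]) -> bool:
    """Model of the autocommit rule: commit when either limit is reached."""
    return limit_reached(pending_docs, max_docs) or limit_reached(elapsed_ms, max_time)


# With only a time limit configured, the document count alone never triggers a commit.
print(should_commit(pending_docs=50_000, elapsed_ms=1_000,
                    max_docs=None, max_time=3_600_000))
# Once the time limit elapses, the commit fires regardless of the document count.
print(should_commit(pending_docs=10, elapsed_ms=3_600_000,
                    max_docs=None, max_time=3_600_000))
```

The same either-or check would apply to the soft-commit pair and to the two buffer settings, under the same assumptions.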
Return Codes
200: OK
Examples
Get the existing settings for the collection:
Input
curl http://localhost:8888/api/collections/collection1/settings
Output
{
"auto_complete": true,
"boost_recent": true,
"boosts": [
"recip(rord(lastModified),1,1000,1000)"
],
"click_boost_data": "click-data",
"click_boost_field": "click",
"click_enabled": false,
"de_duplication": "off",
"default_sort": "relevance",
"display_facets": true,
"display_fields": [
"id","url","author","data_source_type","lastModified",
"mimeType","pageCount","title"],
"elevations": {},
"main_index_lock_type": "native",
"main_index_max_buffered_docs": -1,
"main_index_max_merge_docs": 2147483647,
"main_index_merge_factor": 10,
"main_index_ram_buffer_size_mb": 64.0,
"main_index_term_index_interval": 32,
"main_index_use_compound_file": false,
"main_index_write_lock_timeout": 1000,
"query_parser": "lucid",
"query_time_stopwords": true,
"query_time_synonyms": true,
"search_server_list": [],
"show_similar": true,
"spellcheck": true,
"stopword_list": [
"a","an","and","are","as","at","be","but","by","for","if","in","into",
"is","it","no","not","of","on","or","s","such","t","that","the",
"their","then","there","these","they","this","to","was","will","with"],
"synonym_list": [
"lawyer, attorney","one, 1","two, 2","three, 3","ten, 10",
"hundred, 100","thousand, 1000","tv, television"],
"unknown_type_handling": "text_en",
"unsupervised_feedback": false,
"unsupervised_feedback_emphasis": "relevancy",
"update_handler_autocommit_max_docs": null,
"update_handler_autocommit_max_time": 3600000,
"update_handler_autocommit_open_searcher": true,
"update_handler_autosoftcommit_max_docs": null,
"update_handler_autosoftcommit_max_time": null,
"update_server_list": null
}
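A client might read this response along the following lines. This is a minimal sketch using only the Python standard library. `fetch_settings` and `enabled_features` are hypothetical helper names, the base URL is the one from the curl example, and the sketch assumes the endpoint requires no authentication.

```python
import json
from urllib.request import urlopen


def fetch_settings(collection: str, base_url: str = "http://localhost:8888") -> dict:
    """GET all settings for a collection and decode the JSON body."""
    with urlopen(f"{base_url}/api/collections/{collection}/settings") as resp:
        return json.load(resp)


def enabled_features(settings: dict) -> list:
    """Return the names of boolean settings that are switched on, sorted."""
    return sorted(key for key, value in settings.items() if value is True)


# Offline demonstration on a fragment of the example output above;
# fetch_settings("collection1") would supply the full map from a live server.
sample = json.loads('{"auto_complete": true, "click_enabled": false, '
                    '"spellcheck": true, "query_parser": "lucid"}')
print(enabled_features(sample))
```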
Get a Particular Setting
GET /api/collections/collection/settings/name
Returns a map containing the requested setting and its current value.
Input
Path Parameters
| Key | Description |
|---|---|
| collection | The collection name. |
| name | The name of the setting to return. |
Query Parameters
None.
Output
Return Codes
200: OK
Examples
Determine the default parser for the collection.
Input
curl 'http://localhost:8888/api/collections/collection1/settings/query_parser'
Output
{
"query_parser":"lucid"
}
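Because the response is a map with a single entry, a caller usually wants just the value. The sketch below shows one way to extract it in Python. `parse_single_setting` is a hypothetical helper, and the body string is a hand-written stand-in for the response rather than a captured one.

```python
import json


def parse_single_setting(body: str, name: str):
    """Extract one setting's value from the single-entry map the endpoint returns."""
    data = json.loads(body)
    if name not in data:
        # Guard against a response that does not contain the requested key.
        raise KeyError(f"setting {name!r} not present in response")
    return data[name]


# Stand-in for the response body of the query_parser request.
print(parse_single_setting('{"query_parser": "lucid"}', "query_parser"))
```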
Update Settings
PUT /api/collections/collection/settings
Input
Path Parameters
| Key | Description |
|---|---|
| collection | The collection name. |
Query Parameters
None.
Input Content
JSON block with values for keys to be updated.
| Key | Type | Description |
|---|---|---|
| auto_complete | boolean | Is true if auto-complete is enabled for use in the LucidWorks Search default search interface. Note that this also requires setting the auto-complete activity to run at regular intervals. For more information, see Auto-Complete of User Queries. |
| boosts | Solr function query | Defines the boost to apply to each query. The default boost for the Lucid Query Parser prefers more recent documents. |
| boost_recent | boolean | Is true if the lucid request handler should boost recent documents. |
| click_enabled | boolean | Is true if Click is enabled (LucidWorks Search on-premise only). |
| click_boost_data | string | The path to Click boost data (LucidWorks Search on-premise only). |
| click_boost_field | string | The field name prefix used by Click fields (LucidWorks Search on-premise only). |
| click_index_location | string | The path to Click boost index (LucidWorks Search on-premise only). |
| de_duplication | string | The valid values are: off (do not de-duplicate), overwrite (overwrite duplicate documents), and tag (tag duplicates with a unique signature). Note that de-duplication does not work properly in SolrCloud mode. |
| default_sort | string | Default sort method - valid values are: relevance, date, random. |
| display_facets | boolean | Is true if the LucidWorks Search default search interface should display facets. |
| display_fields | string | Defines the fields to use for display of results to users. Primarily used to add pseudo-fields to documents, but could be used with "real" fields also. This parameter only applies when using the lucid handler (query parser). |
| elevations | JSON map | Defines the documents to be elevated or excluded from results for a specific query. It uses Solr's QueryElevationComponent, which is enabled by default in LucidWorks Search. This API is an interface to manage the elevate.xml file, which stores the elevation definitions that are used for queries. The elevations file is located in the conf directory for each collection ($LWE_HOME/conf/solr/cores/collection/conf). The elevate.xml file is an XML file defining the query and the IDs of the documents that are to be elevated or excluded. The API uses a JSON map to write to this file with a structure of: {"elevations": {"query": [{"doc": "docID", "exclude": true}]}} |
| main_index_lock_type | string | Defines which Lucene LockFactory to use. When applying changes to an index, the IndexWriter requires a lock on the directory. The options are: |
| main_index_max_buffered_docs | integer | Allows setting the maxBufferedDocs parameter in the solrconfig.xml file for the collection, which sets the number of document updates to buffer in memory before they are flushed to disk and added to the current index segment. It is generally preferable to use main_index_ram_buffer_size_mb, but if both settings are defined, a flush will occur when either limit is reached. |
| main_index_max_merge_docs | integer | Allows setting the maxMergeDocs parameter in the solrconfig.xml file for the collection, which sets the maximum number of documents for a single segment. Once this limit is reached, the segment is closed and a new one is created. A segment merge, as defined by main_index_merge_factor, may also occur at this time. |
| main_index_merge_factor | integer | Allows setting the mergeFactor parameter in the solrconfig.xml file for the collection, which defines how many segments the index is allowed to have before they are coalesced into one segment. When the index is updated, the new data is added to the most recently opened segment. When that segment is full, a new segment is created and subsequent updates are placed there (when a segment counts as full is defined by the main_index_max_buffered_docs and main_index_ram_buffer_size_mb settings). When the main_index_merge_factor is reached, the segments are merged into a single larger segment. See the section on mergeFactor in the Solr Reference Guide for more information. |
| main_index_ram_buffer_size_mb | integer | Allows setting the ramBufferSizeMB parameter in the solrconfig.xml file for the collection, which sets the amount of memory space (in megabytes) document updates can use before they are flushed to the current index segment. This setting is generally preferable to main_index_max_buffered_docs, but if both settings are defined, a flush will occur when either limit is reached. |
| main_index_term_index_interval | integer | Allows setting the termIndexInterval for the index, which determines the amount of computation required per query term, regardless of the number of documents. This allows some level of control over the time query processing takes. Larger values cause less memory to be used by the IndexReader, but slow random access to terms. Smaller values cause more memory to be used by the IndexReader, but speed up random access to terms. A large index with user-entered queries may benefit from a larger main_index_term_index_interval because query processing is dominated by frequency and positional data processing and not by term lookup. A system that handles a great deal of wildcard queries may benefit from a smaller value for this setting. |
| main_index_use_compound_file | boolean | Allows you to set the useCompoundFile parameter in the solrconfig.xml file for the collection. Setting this to true combines the multiple index files on disk into a single file. This helps avoid hitting the open file limit on systems that restrict the number of open files allowed per process. See the section on useCompoundFile in the Solr Reference Guide for more information. |
| main_index_write_lock_timeout | integer | Defines the maximum time to wait for a write lock. |
| query_parser | string | Which query parser the lucid search request handler will use - valid values are: lucid, dismax, extended dismax, lucene. |
| query_time_stopwords | boolean | Is true if stopwords will be removed at query time. |
| query_time_synonyms | boolean | Is true if synonyms should be added to queries. This will only be used if the 'lucid' query parser is selected as the default or used in the query request. |
| search_server_list | list:string | A list of Solr core URLs that the lucid request handler will use for distributed search - pass an empty list to disable distributed search. |
| show_similar | boolean | Is true if a "Find Similar" link should be displayed next to user's search results. |
| spellcheck | boolean | Is true if the LucidWorks Search default search interface should suggest spelling corrections. |
| stopword_list | list:string | A list of stopwords that will be used if 'query_time_stopwords' is enabled. |
| synonym_list | list:string | A list of synonym rules that will be used if 'query_time_synonyms' is enabled. |
| unsupervised_feedback | boolean | Is true if unsupervised feedback is enabled. |
| unsupervised_feedback_emphasis | string | Defines whether unsupervised feedback should emphasize "relevancy", which does an "AND" of the original query and so neither includes nor excludes additional documents, or "recall", which does an "OR" of the original query and so permits the feedback terms to expand the set of documents matched. The default is "relevancy". |
| unknown_type_handling | string | A valid field type from the core's schema to use for unrecognized fields - default is text_en. |
| update_handler_autocommit_max_docs | integer | Allows setting the maxDocs parameter for autocommit definitions in the solrconfig.xml file for the collection. This setting defines the number of documents to queue before pushing them to the index. It works in conjunction with the update_handler_autocommit_max_time parameter in that if either limit is reached, the pending updates will be pushed to the index. |
| update_handler_autocommit_max_time | integer | Allows setting the maxTime parameter for autocommit definitions in the solrconfig.xml file for the collection. This setting defines the number of milliseconds to wait before pushing documents to the index. It works in conjunction with the update_handler_autocommit_max_docs parameter in that if either limit is reached, the pending updates will be pushed to the index. |
| update_handler_autocommit_open_searcher | boolean | Provides the option to not open a searcher on hard commit. This may be useful to minimize the size of transaction logs that keep track of uncommitted updates. The default is true; change this to false to not open a searcher. |
| update_handler_autosoftcommit_max_docs | integer | Allows setting the maxDocs parameter for autosoftcommit definitions in the solrconfig.xml file for the collection. "Soft" commits are used in Solr's Near RealTime searching. This setting defines the number of documents to queue before pushing them to the index. It works in conjunction with the update_handler_autosoftcommit_max_time parameter in that if either limit is reached, the documents will be pushed to the index. |
| update_handler_autosoftcommit_max_time | integer | Allows setting the maxTime parameter for autosoftcommit definitions in the solrconfig.xml file for the collection. "Soft" commits are used in Solr's Near RealTime searching. This setting defines the number of milliseconds to wait before pushing documents to the index. It works in conjunction with the update_handler_autosoftcommit_max_docs parameter in that if either limit is reached, the documents will be pushed to the index. |
| update_server_list | complex | A map that contains two keys: 'server_list' and 'self'. 'server_list' is list:string of servers that the lucid update chain will use for distributed updates and 'self' should either be null if this server will not receive updates, or it should be a string value containing this server address if this server will receive updates - pass an empty list of servers to disable distributed update. |
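The elevations map described in the table can be assembled programmatically before it is sent. The following Python sketch assumes document IDs are plain strings. `build_elevations` is a hypothetical helper, and the query text and IDs are placeholders.

```python
import json
from typing import Iterable


def build_elevations(query: str, elevate: Iterable[str] = (),
                     exclude: Iterable[str] = ()) -> dict:
    """Build the JSON map for one query.

    Elevated documents get "exclude": false; excluded documents get "exclude": true.
    """
    docs = [{"doc": doc_id, "exclude": False} for doc_id in elevate]
    docs += [{"doc": doc_id, "exclude": True} for doc_id in exclude]
    return {"elevations": {query: docs}}


# Placeholder query and document IDs, for illustration only.
payload = build_elevations("solr", elevate=["doc1"], exclude=["doc9"])
print(json.dumps(payload))
```

The resulting string has the same shape as the structure shown in the elevations row and could be used as the body of the PUT request.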
Output
Output Content
None.
Return Codes
204: No Content
Examples
Turn on spell-checking for the collection.
Input
curl -X PUT -H 'Content-type: application/json' \
  -d '{"spellcheck":true}' \
  http://localhost:8888/api/collections/collection1/settings
Output
None. Retrieve the settings again to confirm the change.
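The same update can be prepared from Python. The sketch below uses only the standard library. `build_update_request` and `apply_update` are hypothetical helper names, and the base URL is the one used in the curl example. The request is only built and inspected here; `apply_update` would actually send it and is left uncalled so the example runs without a server.

```python
import json
from urllib.request import Request, urlopen


def build_update_request(collection: str, changes: dict,
                         base_url: str = "http://localhost:8888") -> Request:
    """Prepare a PUT carrying only the keys to be updated, encoded as JSON."""
    return Request(
        url=f"{base_url}/api/collections/{collection}/settings",
        data=json.dumps(changes).encode("utf-8"),
        headers={"Content-type": "application/json"},
        method="PUT",
    )


def apply_update(request: Request) -> int:
    """Send the request and return the HTTP status (204 is documented for success)."""
    with urlopen(request) as resp:
        return resp.status


req = build_update_request("collection1", {"spellcheck": True})
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Since a successful update returns no body, a follow-up GET of the same setting is the simplest way to confirm the new value.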