Dynamic fields are those which are not explicitly defined, but are created during indexing based on some criteria, such as a prefix or suffix on the field name. For example, a data set may have a number of fields which end in "_b". Using dynamicField functionality in Solr, it's possible to add the content of those fields to the index without having to specify each one of them in the schema.xml file. The Dynamic Fields API allows for accessing, modifying, or adding dynamic field definitions to a Collection schema.
Unlike explicit fields, dynamic fields cannot be used for faceting, highlighting, de-duplication, or MoreLikeThis. However, a dynamic field can be converted to an explicit field with the Fields API, at which point those attributes can be enabled and the field used for those features.
| By default, LucidWorks Search includes a dynamic field declaration of *. This rule allows the LucidWorks Search crawlers to add fields as needed while processing crawled documents. If you are not using the crawlers, or you are sure of how the crawlers will parse documents, this rule can be removed. See also Customizing the Field Schema for more information on default dynamic rules. |
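To illustrate how a prefix or suffix rule relates to concrete field names, here is a minimal Python sketch that lists which dynamic field patterns a field name would match. The function name `matching_rules` and the sample rule list are hypothetical, and the sketch makes no claim about which rule Solr prefers when several match; it only shows the wildcard matching itself, which is case sensitive.

```python
from fnmatch import fnmatchcase


def matching_rules(field_name, patterns):
    """Return the dynamic field patterns that match field_name, in the order given.

    fnmatchcase performs case-sensitive glob matching, so "*" stands for any
    run of characters (including none).
    """
    return [p for p in patterns if fnmatchcase(field_name, p)]


# Hypothetical rule list: two rules from the examples in this document,
# the "_b" suffix rule from the introduction, and the default catch-all "*".
rules = ["attr_*", "*_dt", "*_b", "*"]

print(matching_rules("is_public_b", rules))  # ['*_b', '*']
print(matching_rules("attr_color", rules))   # ['attr_*', '*']
print(matching_rules("Attr_color", rules))   # ['*']
```

The last call shows why removing the catch-all `*` rule matters: without it, a field such as `Attr_color` would match no dynamic field rule at all.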
API Entry Points
/api/collections/collection/dynamicfields: get all dynamic fields and their attributes for a collection or create a new dynamic field.
/api/collections/collection/dynamicfields/name: update, delete, or get details for a particular dynamic field.
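As a rough sketch of how a client might assemble these two entry points, the Python helper below builds the URLs from a collection name and an optional dynamic field name. The helper name `dynamicfields_url` is hypothetical, the host and port are taken from the examples in this document, and leaving `*` unescaped in the path is an assumption based on the curl examples shown later.

```python
from urllib.parse import quote

# Host and port as used in the examples in this document.
BASE_URL = "http://localhost:8888"


def dynamicfields_url(collection, name=None):
    """Build the collection-level URL, or the per-field URL when name is given."""
    url = f"{BASE_URL}/api/collections/{quote(collection, safe='')}/dynamicfields"
    if name is not None:
        # Assumption: "*" is sent literally, as in the curl examples.
        url += "/" + quote(name, safe="*")
    return url


print(dynamicfields_url("collection1"))
print(dynamicfields_url("collection1", "attr_*"))
```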
Get a List of Dynamic Fields and Attributes for a Collection
GET /api/collections/collection/dynamicfields
Input
Path Parameters
| Key | Description |
|---|---|
| collection | The collection name. |
Query Parameters
None.
Output
Output Content
| Key | Type | Description |
|---|---|---|
| name | string | The name of the dynamic field. Dynamic field names are case sensitive. Currently a field name must consist of only A-Z, a-z, 0-9, - or _, and either begin or end (but not both) with *. Examples of legal names include attr_*, *_t or *. |
| copy_fields | list <string> | A list of field names that this field will be copied to. |
| field_type | string | A valid field type defined in schema.xml, which can be created or modified with the FieldType API. The field type setting controls how a field is analyzed. There are many options available, and more can be added by adding a new plugin to the schema.xml file. It is crucial to understand the underlying values for a field in order to correctly set its type. For full text fields, a text field type is generally the desired setting so individual words in the text are searchable. There are various text field types, most of which are language-specific. However, when a text field value is to be taken literally as-is (exact match only, or for faceting), the "string" type is likely the right choice. There are also types for numeric data, including double, float, integer, and long (and variants of each suitable for sorting: sdouble, sfloat, sint, and slong). The date field accepts dates in the form "1995-12-31T23:59:59.999Z", with the fractional seconds optional, and trailing "Z" mandatory. If you change a field type, we strongly recommend reindexing. |
| index_for_autocomplete | boolean | Set to true if these fields will be used as a source for autocomplete. This allows terms from these fields to be used in creation of an auto-complete index that will be created by default at the time of indexing. All fields selected for use in auto-complete are combined into a single "autocomplete" field for use in search suggestions. If you change this setting, we recommend that you recreate the auto-complete index as described in Auto-Complete of User Queries. |
| index_for_spellcheck | boolean | Set to true if these fields will be used as a source for spellchecking. This allows terms from these fields to be used in the creation of a spell check index. All fields selected for use in spell checking are combined into a single "spell" field for use in search suggestions. |
| indexed | boolean | Set to true if these fields will be indexed for full text search. An indexed field is searchable on the words (or exact value) as determined by the field type. Unindexed fields are useful to provide the search client with metadata for display. For example, URL may not be a valuable search term, but it is very valuable information to show users in their results list. For performance reasons, a best practice is to index as few fields as necessary to still give users a satisfactory search experience. If you change this setting, you must reindex all documents. |
| multi_valued | boolean | Set to true if these fields will be a 'multi_valued' field. Enable this if the document could have multiple values for a field, such as multiple categories or authors. We recommend that you reindex all documents after changing this setting. |
| omit_tf | boolean | The omit_tf attribute sets Solr's omitTermFreqAndPositions attribute in the schema. If true, term frequency and position information will not be indexed. Set this to true if the number of times a term occurs in a document (term frequency) and the proximity of a term to other terms (position) should NOT be stored. This may be useful for fields that are indexed but not used for searching. This option should not be enabled for text fields (for example, field type text_en) since it would prevent the proper operation of phrase queries and other proximity operators such as NEAR which depend on position information. This attribute works in conjunction with the omit_positions attribute; see the description of that attribute for valid combinations of the attributes. |
| omit_positions | boolean | The omit_positions attribute sets Solr's omitPositions attribute in the schema. If true, term position information will not be indexed. Set this to true if the proximity of a term to other terms should NOT be stored. This attribute works with the omit_tf attribute in that it is possible to remove position information while retaining term frequency information. There are three possible valid combinations of omit_tf and omit_positions: |
| stored | boolean | Set to true if the original unanalyzed text will be stored. Fields can be stored independently of indexing, and made available in the results sent to a search client. Reindexing is not necessary when changing the stored field flag, though fields in documents will remain as they were when they were originally indexed until they are reindexed. |
| term_vectors | boolean | This attribute is for expert use only with Solr's TermVectorComponent. It may help you achieve better highlighting and MoreLikeThis performance at the expense of a larger index. For more information, see http://wiki.apache.org/solr/FieldOptionsByUseCase. |
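The date format described under field_type (fractional seconds optional, trailing "Z" mandatory) can be checked on the client before documents are sent for indexing. The following is a minimal Python sketch under that description only; the function name `is_valid_date_value` is hypothetical, and the server may accept or reject values this check does not anticipate.

```python
import re
from datetime import datetime

# Shape: YYYY-MM-DDThh:mm:ss, optional fractional seconds, mandatory trailing Z.
_DATE_SHAPE = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z")


def is_valid_date_value(value):
    """Return True if value has the documented shape and names a real date and time."""
    if _DATE_SHAPE.fullmatch(value) is None:
        return False
    try:
        # The first 19 characters are the date and time without fraction or Z.
        datetime.strptime(value[:19], "%Y-%m-%dT%H:%M:%S")
    except ValueError:
        return False
    return True


print(is_valid_date_value("1995-12-31T23:59:59.999Z"))  # True
print(is_valid_date_value("1995-12-31T23:59:59Z"))      # True
print(is_valid_date_value("1995-12-31T23:59:59.999"))   # False (no trailing Z)
```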
Return Codes
200: OK
404: Not Found
Examples
Get a list of all dynamic fields for the default LucidWorks collection "collection1":
Input
curl http://localhost:8888/api/collections/collection1/dynamicfields
Output
[
{
"field_type":"string",
"multi_valued":true,
"indexed":true,
"name":"attr_*",
"term_vectors":false,
"index_for_spellcheck":false,
"index_for_autocomplete":false,
"omit_tf":true,
"stored":true,
"copy_fields":[ ],
"omit_positions":true
},
{
"field_type":"date",
"multi_valued":false,
"indexed":true,
"name":"*_dt",
"term_vectors":false,
"index_for_spellcheck":false,
"index_for_autocomplete":false,
"omit_tf":true,
"stored":true,
"copy_fields":[ ],
"omit_positions":true
}
]
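The response is a plain JSON list, so it can be indexed by rule name on the client. As a small illustration, this Python sketch parses the sample output above (copied into a string rather than fetched from a server) and picks out the multi-valued rules; the variable names are illustrative only.

```python
import json

# Sample response body copied from the example output above.
response_body = """
[
  {"field_type": "string", "multi_valued": true, "indexed": true, "name": "attr_*",
   "term_vectors": false, "index_for_spellcheck": false, "index_for_autocomplete": false,
   "omit_tf": true, "stored": true, "copy_fields": [], "omit_positions": true},
  {"field_type": "date", "multi_valued": false, "indexed": true, "name": "*_dt",
   "term_vectors": false, "index_for_spellcheck": false, "index_for_autocomplete": false,
   "omit_tf": true, "stored": true, "copy_fields": [], "omit_positions": true}
]
"""

fields = json.loads(response_body)

# Index the definitions by rule name for direct lookup.
by_name = {field["name"]: field for field in fields}

# Names of the rules that allow more than one value per document.
multi_valued = [field["name"] for field in fields if field["multi_valued"]]

print(by_name["*_dt"]["field_type"])  # date
print(multi_valued)                   # ['attr_*']
```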
Create a New Dynamic Field
POST /api/collections/collection/dynamicfields
Input
Path Parameters
| Key | Description |
|---|---|
| collection | The collection name. |
Query Parameters
None.
Input Content
JSON block with one or more field attribute key/value pairs.
| When creating a new dynamic field rule, the default values of some attributes are inherited from the field_type specified when the rule is created. These exceptions are noted below. |
| Key | Type | Required | Default | Description |
|---|---|---|---|---|
| name | string | Yes | No default | The name of the dynamic field. Dynamic field names are case sensitive. Currently a field name must consist of only A-Z, a-z, 0-9, - or _, and either begin or end (but not both) with *. Examples of legal names include attr_*, *_t or *. |
| copy_fields | list <string> | No | null | A list of field names that this field will be copied to. |
| field_type | string | Yes | No default | A valid field type defined in schema.xml. The field type setting controls how a field is analyzed. There are many options available, and more can be added by adding a new plugin to the schema.xml file. It is crucial to understand the underlying values for a field in order to correctly set its type. For full text fields, such as "title", "body", or "description", a text field type is generally the desired setting so individual words in the text are searchable. There are various text field types, most of which are language-specific. However, when a text field value is to be taken literally as-is (exact match only, or for faceting), the "string" type is likely the right choice. There are also types for numeric data, including double, float, integer, and long (and variants of each suitable for sorting: sdouble, sfloat, sint, and slong). The date field accepts dates in the form "1995-12-31T23:59:59.999Z", with the fractional seconds optional, and trailing "Z" mandatory. New field types can be created with the FieldTypes API. If you change a field type, we strongly recommend reindexing. |
| index_for_autocomplete | boolean | No | false | Set to true if these fields should be used as a source for autocomplete. This allows terms from these fields to be used in creation of an auto-complete index that will be created by default at the time of indexing. All fields selected for use in auto-complete are combined into a single "autocomplete" field for use in search suggestions. If you change this setting, we recommend that you recreate the auto-complete index as described in Auto-Complete of User Queries. |
| index_for_spellcheck | boolean | No | false | Set to true if these fields should be used as a source for spellchecking. This allows terms from these fields to be used in the creation of a spell check index. All fields selected for use in spell checking are combined into a single "spell" field for use in search suggestions. |
| indexed | boolean | No | Inherited from the field_type | Set to true if these fields should be indexed for full text search. Indexed fields are searchable on the words (or exact value) as determined by the field type. Unindexed fields are useful to provide the search client with metadata for display. For example, URL may not be a valuable search term, but it is very valuable information to show users in their results list. For performance reasons, a best practice is to index as few fields as necessary to still give users a satisfactory search experience. If you change this setting, you must reindex all documents. |
| multi_valued | boolean | No | Inherited from the field_type | Set to true if these fields should be a 'multi_valued' field. Enable this if the document could have multiple values for a field, such as multiple categories or authors. We recommend that you reindex all documents after changing this setting. |
| omit_tf | boolean | No | Inherited from the field_type | The omit_tf attribute sets Solr's omitTermFreqAndPositions attribute in the schema for this collection. If true, term frequency and position information will not be indexed. Set to true if the number of times a term occurs in a document (term frequency) and the proximity of a term to other terms (position) should NOT be stored. This may be useful for fields that are indexed but not used for searching. This option should not be enabled for text fields (for example, field type text_en) since it would prevent the proper operation of phrase queries and other proximity operators such as NEAR which depend on position information. |
| omit_positions | boolean | No | Inherited from the field_type | The omit_positions attribute sets Solr's omitPositions attribute in the schema for this collection. If true, term position information will not be indexed. Enable this if the proximity of a term to other terms should not be stored. This attribute works with the omit_tf attribute in that it is possible to remove position information while retaining term frequency information. There are three possible valid combinations of omit_tf and omit_positions: |
| stored | boolean | No | true | Set to true if the original unanalyzed text should be stored. Fields can be stored independently of indexing, and made available in the results sent to a search client. Reindexing is not necessary when changing the stored field flag, though fields in documents will remain as they were when they were originally indexed until they are reindexed. |
| term_vectors | boolean | No | Inherited from the field_type | This attribute is for expert use only with Solr's TermVectorComponent. It may help you achieve better highlighting and MoreLikeThis performance at the expense of a larger index. For more information, see http://wiki.apache.org/solr/FieldOptionsByUseCase. |
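The naming rule and the two required keys in the table above can be checked before a request is sent. The Python sketch below is one possible client-side reading of those rules, not the server's own validation; the names `is_legal_dynamic_field_name` and `new_dynamic_field` are hypothetical. It treats a lone `*` as legal because the table lists it as an example.

```python
import re

# Either "*" followed by allowed characters (possibly none, which covers the
# catch-all "*"), or one or more allowed characters followed by "*".
_NAME_PATTERN = re.compile(r"\*[A-Za-z0-9_-]*|[A-Za-z0-9_-]+\*")


def is_legal_dynamic_field_name(name):
    """Return True if name follows the documented dynamic field naming rule."""
    return _NAME_PATTERN.fullmatch(name) is not None


def new_dynamic_field(name, field_type, **attributes):
    """Build the JSON-ready dict for a new dynamic field; name and field_type are required."""
    if not is_legal_dynamic_field_name(name):
        raise ValueError(f"illegal dynamic field name: {name!r}")
    if not field_type:
        raise ValueError("field_type must be specified")
    payload = {"name": name, "field_type": field_type}
    payload.update(attributes)  # optional attributes such as indexed or stored
    return payload


print(is_legal_dynamic_field_name("attr_*"))  # True
print(is_legal_dynamic_field_name("*_t*"))    # False (begins and ends with *)
print(new_dynamic_field("*_sh", "text_en", indexed=True, stored=True))
```

Attributes that are left out of the payload fall back to the defaults listed in the table, several of which are inherited from the field type.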
Output
Output Content
JSON representation of the created dynamic field.
Return Codes
201: Created
422: Unprocessable Entity
This error may have several different conditions:
- Name must be specified
- You must specify a field_type for the field
- A field already exists with the same name
Examples
Input
curl -H 'Content-type: application/json' -d '
{
"name":"*_sh",
"indexed":true,
"stored":true,
"field_type":"text_en"
}' http://localhost:8888/api/collections/collection1/dynamicfields
Output
{
"field_type":"text_en",
"multi_valued":false,
"indexed":true,
"name":"*_sh",
"term_vectors":false,
"index_for_spellcheck":false,
"index_for_autocomplete":false,
"omit_tf":false,
"stored":true,
"copy_fields":[ ],
"omit_positions":false
}
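For clients that do not shell out to curl, the same request can be prepared with Python's standard library. This is a sketch only: the function name `build_create_request` is hypothetical, and the request is constructed but not sent, since sending it requires a running server at the address used in this document's examples.

```python
import json
from urllib.request import Request


def build_create_request(base_url, collection, definition):
    """Prepare a POST request that creates a dynamic field from a dict of attributes."""
    return Request(
        f"{base_url}/api/collections/{collection}/dynamicfields",
        data=json.dumps(definition).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_request(
    "http://localhost:8888",
    "collection1",
    {"name": "*_sh", "indexed": True, "stored": True, "field_type": "text_en"},
)

print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it: urllib.request.urlopen(request), which needs a reachable server.
```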
Get Attributes for a Dynamic Field
GET /api/collections/collection/dynamicfields/name
Input
Path Parameters
| Key | Description |
|---|---|
| collection | The collection name. |
| name | The dynamic field name. |
Query Parameters
None.
Input Content
None.
Output
Output Content
A JSON map of keys to values. For a list of keys, see GET: Output Content.
Return Codes
200: OK
404: Not Found
Examples
Input
curl 'http://localhost:8888/api/collections/collection1/dynamicfields/attr_*'
Output
{
"field_type":"string",
"multi_valued":true,
"indexed":true,
"name":"attr_*",
"term_vectors":false,
"index_for_spellcheck":false,
"index_for_autocomplete":false,
"omit_tf":true,
"stored":true,
"copy_fields":[ ],
"omit_positions":true
}
Update a Dynamic Field
PUT /api/collections/collection/dynamicfields/name
Input
Path Parameters
| Key | Description |
|---|---|
| collection | The collection name. |
| name | The dynamic field name. |
Query Parameters
None.
Input Content
JSON block with one or more key to value mappings. Any keys you don't edit will keep their existing values. For a list of keys, see POST: Input Content.
Output
Output Content
None.
Return Codes
204: No Content
404: Not Found
Examples
Edit the "attr_*" dynamic field so that it is multi-valued:
Input
curl -X PUT -H 'Content-type: application/json' \
-d '{"multi_valued":true}' \
'http://localhost:8888/api/collections/collection1/dynamicfields/attr_*'
Output
None. (Check the dynamic field properties to confirm changes.)
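Because the update returns no body, confirming it means comparing a follow-up GET against what the update should have produced. The Python sketch below models the documented behavior (edited keys change, all other keys keep their values) and reports any mismatch. The function names and the sample dictionaries are hypothetical; `after` stands in for the body of a later GET on the same dynamic field.

```python
def expected_after_update(existing, changes):
    """Model of the update: keys in changes are replaced, all other keys are kept."""
    updated = dict(existing)  # copy so the original definition is not mutated
    updated.update(changes)
    return updated


def differences(expected, actual):
    """Return {key: (expected, actual)} for every key whose values disagree."""
    keys = sorted(set(expected) | set(actual))
    return {k: (expected.get(k), actual.get(k)) for k in keys if expected.get(k) != actual.get(k)}


# Hypothetical definition before the update (abbreviated to a few keys).
before = {"name": "attr_*", "field_type": "string", "multi_valued": False, "stored": True}
expected = expected_after_update(before, {"multi_valued": True})

# Stand-in for the JSON returned by a follow-up GET.
after = {"name": "attr_*", "field_type": "string", "multi_valued": True, "stored": True}

print(differences(expected, after))  # {} when the update was applied as expected
```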
Delete a Dynamic Field
DELETE /api/collections/collection/dynamicfields/name
Note that deleting a field only removes it as an option for new documents; existing documents will retain this field, even after it's been deleted. Also, listing all fields in the collection will still show the field after it's been deleted.
Input
Path Parameters
| Key | Description |
|---|---|
| collection | The collection name. |
| name | The dynamic field name. |
Query Parameters
None.
Input Content
None.
Output
Output Content
None.
Return Codes
204: No Content
404: Not Found
Examples
Delete the "*_sh" dynamic field.
Input
curl -X DELETE 'http://localhost:8888/api/collections/collection1/dynamicfields/*_sh'
Output
None.
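Since a successful delete returns no body, a client has only the status code to go on. The following Python sketch prepares the DELETE request and maps the two return codes listed above to short messages. The function names are hypothetical, and the request is built but not sent.

```python
from urllib.request import Request

# Return codes documented for this endpoint.
DELETE_OUTCOMES = {204: "deleted", 404: "not found"}


def build_delete_request(base_url, collection, name):
    """Prepare a DELETE request for one dynamic field rule."""
    return Request(
        f"{base_url}/api/collections/{collection}/dynamicfields/{name}",
        method="DELETE",
    )


def describe_delete_status(status):
    """Translate a status code into a short message; other codes are flagged as unexpected."""
    return DELETE_OUTCOMES.get(status, f"unexpected status {status}")


request = build_delete_request("http://localhost:8888", "collection1", "*_sh")
print(request.get_method(), request.full_url)
print(describe_delete_status(204))  # deleted
print(describe_delete_status(404))  # not found
```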