This section describes how to configure crawling and indexing with the LucidWorks Platform.
Getting the index structure right can be complex, even though we've tried to make it as easy as possible with our default configuration. Every implementation is unique, so you'll want to spend some time thinking about your content and whether you need to adjust any of the defaults:
Supported Filetypes
How Documents Map to Fields
Customizing the Schema
Suppressing Stop Word Indexing
For the most part, crawling requires only that you configure a data source with the UI or the API and then start the crawl. However, if you are using batch crawling, Access Control Lists, databases containing binary data, or an "external" crawler, there is additional configuration you'll want to do:
Troubleshooting Document Crawling
Batch (Split) Crawling
Crawling Windows Shares with Access Control Lists
Suggestions for External Data Source Documents
Indexing Binary Data Stored in a Database
Integration with External Pipelines
Deleting the Index
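
As a rough illustration of the "configure a data source with the API, then start the crawl" flow described above, the following Python sketch builds the two HTTP requests without sending them. Everything specific in it is an assumption made for illustration and is not taken from this page: the base URL, the collection name, the endpoint paths, and the payload field names (`name`, `type`, `url`). Check the API documentation for your installation before relying on any of them.

```python
import json
from urllib.request import Request

# Assumption: host, port, and collection name are placeholders for illustration.
BASE_URL = "http://localhost:8888/api/collections/collection1"


def build_create_datasource_request(name, url):
    """Build (but do not send) a request that would define a web data source.

    The payload field names are hypothetical; a real installation may
    require different or additional fields.
    """
    payload = {"name": name, "type": "web", "url": url}
    return Request(
        BASE_URL + "/datasources",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_start_crawl_request(datasource_id):
    """Build (but do not send) a request that would start a crawl job."""
    return Request(
        f"{BASE_URL}/datasources/{datasource_id}/job",  # assumed endpoint path
        method="PUT",
    )


if __name__ == "__main__":
    create = build_create_datasource_request("docs", "http://example.com/")
    start = build_start_crawl_request(6)
    print(create.get_method(), create.full_url)
    print(start.get_method(), start.full_url)
```

To actually issue a request, pass it to `urllib.request.urlopen`. The sketch stops short of that so it can be read and run without a server.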