The Data Source History API returns statistics for previous runs of a data source. It reports only on prior crawls; for details on crawls that are currently running, use Data Source Jobs or Data Source Status.
Note that some crawlers (such as the lucid.fs crawler) are "stateless", meaning the crawler cannot know which documents were deleted or modified between crawls. In crawl statistics, this can mean that deleted or updated documents are not counted as such, or that the "new" document counts from two different crawls do not add up to the number of documents in the index.
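The effect of a stateless crawler on the statistics can be illustrated with a small simulation. This is a simplified model, not the lucid.fs implementation: the file names, the `stateless_stats` helper, and the assumption that the index keys documents by ID are all hypothetical.

```python
# Simplified model of a stateless crawler (hypothetical, for illustration only).
# The crawler keeps no record of earlier crawls, so every document it sees
# is reported as "new" and deletions go unnoticed.
crawl_1 = {"a.txt", "b.txt", "c.txt"}
crawl_2 = {"a.txt", "b.txt", "d.txt"}  # c.txt was deleted, d.txt was added


def stateless_stats(seen):
    """Return the statistics a stateless crawler would report for one run."""
    return {"num_new": len(seen), "num_updated": 0, "num_deleted": 0}


history = [stateless_stats(crawl_1), stateless_stats(crawl_2)]
sum_new = sum(run["num_new"] for run in history)

# Assumption: the index stores one entry per document ID, so re-adding a
# document overwrites it, and c.txt stays because its deletion was not seen.
index = set()
index |= crawl_1
index |= crawl_2

print("sum of num_new:", sum_new)
print("documents in index:", len(index))
```

In this model the two runs report six "new" documents in total while the index holds four, and no run reports a deletion or an update.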
/api/collections/name/datasources/id/history: Get statistics for the last 50 runs of the given data source.
Path parameters:
|collection||The collection name.|
|id||The data source ID.|
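As a minimal sketch, the request path can be built from the two path parameters and fetched with the Python standard library. The base URL `http://localhost:8888`, the absence of authentication, and a JSON response body are assumptions; `history_url` and `fetch_history` are hypothetical helper names, not part of the API.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: the server listens here; adjust for your installation.
BASE_URL = "http://localhost:8888"


def history_url(collection, datasource_id, base_url=BASE_URL):
    """Build the history URL for one data source, escaping both path parameters."""
    return "{}/api/collections/{}/datasources/{}/history".format(
        base_url.rstrip("/"),
        quote(str(collection), safe=""),
        quote(str(datasource_id), safe=""),
    )


def fetch_history(collection, datasource_id, base_url=BASE_URL):
    """GET the history and decode it, assuming the body is JSON."""
    with urlopen(history_url(collection, datasource_id, base_url)) as response:
        return json.load(response)


if __name__ == "__main__":
    # Prints the URL only; call fetch_history() against a running server.
    print(history_url("collection1", 8))
```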
Output content (one entry per run):
|id||integer||The ID of the data source.|
|crawl_started||date string||When the crawl began.|
|crawl_stopped||date string||When the crawl finished.|
|crawl_state||string||The current state of the crawl (RUNNING, FINISHED, or STOPPED).|
|num_unchanged||32-bit integer||The number of documents found that were not modified and did not need to be indexed.|
|num_deleted||32-bit integer||The number of documents that were removed from the index because they were no longer found in the source.|
|num_new||32-bit integer||The number of new documents that were found in the source and added to the index.|
|num_updated||32-bit integer||The number of existing documents that were found in the source and updated in the index because they were modified since the last time they were indexed.|
|num_failed||32-bit integer||The number of documents from which the crawler failed to extract text.|
|num_total||32-bit integer||The total number of documents found.|
|batch_job||boolean||If false, documents found will be indexed after crawling.|
|job_id||integer||The ID of the job.|
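The fields above lend themselves to simple post-processing. The sketch below assumes the response decodes to a list of objects, one per run, keyed by the field names in the table; the sample values and the `summarize` helper are hypothetical. The date fields are left out of the sample because their exact string format is not specified here.

```python
# Hypothetical sample data shaped like the output fields described above.
sample_history = [
    {"id": 8, "job_id": 8, "crawl_state": "FINISHED", "batch_job": False,
     "num_new": 90, "num_updated": 0, "num_deleted": 0,
     "num_unchanged": 0, "num_failed": 10, "num_total": 100},
    {"id": 8, "job_id": 8, "crawl_state": "STOPPED", "batch_job": False,
     "num_new": 5, "num_updated": 0, "num_deleted": 0,
     "num_unchanged": 15, "num_failed": 0, "num_total": 20},
]


def summarize(history):
    """Aggregate per-run statistics into a few headline numbers."""
    finished = [run for run in history if run["crawl_state"] == "FINISHED"]
    total_failed = sum(run["num_failed"] for run in history)
    # Failure ratio per run; runs that found no documents are skipped.
    ratios = [run["num_failed"] / run["num_total"]
              for run in history if run["num_total"] > 0]
    return {
        "runs": len(history),
        "finished_runs": len(finished),
        "total_failed": total_failed,
        "worst_failure_ratio": max(ratios) if ratios else 0.0,
    }


summary = summarize(sample_history)
print(summary)
```

A summary like this can help spot runs where text extraction failed unusually often, without assuming any particular relationship between `num_total` and the other counters.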