{excerpt}
The Data Source History API returns historical statistics for previous data source runs. History only returns information about _prior_ crawls for a data source. Use [lweug20:Data Source Jobs] or [lweug20:Data Source Status] for details on currently running crawls.
{toc}
h2. API Entry Points
{{[/api/collections/_name_/datasources/_id_/history|#api1]}}{{:}} Get statistics for the last 50 runs of the given data source.
{anchor:api1}
h2. Get Data Source History
!LucidWorks REST API Reference^bullet.jpg! {{GET /api/collections/}}{{{}{_}collection{_}{}}}{{/datasources/}}{{{}{_}id{_}{}}}{{/history}}
h4. {bgcolor:#FEECC4}{*}Input{*}{bgcolor}
*Path Parameters*
Enter path parameters.
|| Key || Description ||
| collection | The collection name. |
| id | The data source ID. |
*Query Parameters*
None
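As a minimal sketch of how the two path parameters fit into the request, the following Python builds the history URL and fetches the response. The host and port are taken from the example request on this page; the function names {{history_url}} and {{fetch_history}} are hypothetical helpers, not part of the API.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: same host and port as the curl example on this page.
BASE_URL = "http://localhost:8888"


def history_url(collection, datasource_id, base_url=BASE_URL):
    """Build the history URL from the two path parameters."""
    return "{}/api/collections/{}/datasources/{}/history".format(
        base_url,
        quote(str(collection), safe=""),      # percent-encode the collection name
        quote(str(datasource_id), safe=""),   # data source ID
    )


def fetch_history(collection, datasource_id, base_url=BASE_URL):
    """Send the GET request and decode the JSON array of history entries."""
    with urlopen(history_url(collection, datasource_id, base_url)) as response:
        return json.load(response)


print(history_url("myCollection", 8))
```

Calling {{fetch_history("myCollection", 8)}} requires a running server; {{history_url}} can be checked on its own.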
h4. {bgcolor:#FEECC4}{*}Output{*}{bgcolor}
*Output Content*
|| Key || Type || Description ||
| id | integer | The ID of the data source. |
| crawl_started | date string | When the crawl began. |
| crawl_stopped | date string | When the crawl finished. |
| crawl_state | string | The current state of the crawl (RUNNING, FINISHED, or STOPPED). |
| num_unchanged | 32-bit integer | The number of documents found that were not modified and did not need to be indexed. |
| num_deleted | 32-bit integer | The number of documents that were removed from the index because they were no longer found in the source. |
| num_new | 32-bit integer | The number of new documents that were found in the source and added to the index. |
| num_updated | 32-bit integer | The number of existing documents that were found in the source and updated in the index because they were modified since the last time they were indexed. |
| num_failed | 32-bit integer | The number of documents from which the crawler failed to extract text. |
| num_total | 32-bit integer | The total number of documents found. |
| batch_job | boolean | If *false*, documents found will be indexed after crawling. |
| job_id | integer | The ID of the job. |
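The two date fields can be combined to find how long a crawl ran. The sketch below assumes the timestamp layout seen in the example output on this page ({{yyyy-MM-ddTHH:mm:ss+0000}}); {{crawl_duration}} is a hypothetical helper name.

```python
from datetime import datetime

# Assumption: timestamps follow the layout shown in the example output,
# e.g. "2011-03-17T22:16:46+0000".
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S%z"


def crawl_duration(entry):
    """Return the time between crawl_started and crawl_stopped as a timedelta."""
    started = datetime.strptime(entry["crawl_started"], TIMESTAMP_FORMAT)
    stopped = datetime.strptime(entry["crawl_stopped"], TIMESTAMP_FORMAT)
    return stopped - started


entry = {
    "crawl_started": "2011-03-17T22:16:46+0000",
    "crawl_stopped": "2011-03-17T22:16:51+0000",
}
print(crawl_duration(entry).total_seconds())  # 5.0
```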
h4. {bgcolor:#FEECC4}{*}Examples{*}{bgcolor}
*Input*
{code:borderStyle=solid|borderColor=#666666}
curl 'http://localhost:8888/api/collections/myCollection/datasources/8/history'
{code}
*Output*
{code:borderStyle=solid|borderColor=#666666}
[
    {
        "id": 2,
        "crawl_started": "2011-03-17T22:16:46+0000",
        "num_unchanged": 0,
        "crawl_state": "FINISHED",
        "crawl_stopped": "2011-03-17T22:16:51+0000",
        "num_updated": 0,
        "num_new": 6,
        "num_failed": 0,
        "num_deleted": 0,
        "num_total": 6,
        "batch_job": false,
        "job_id": 3
    },
    {
        "id": 2,
        "crawl_started": "2011-03-18T03:25:04+0000",
        "num_unchanged": 0,
        "crawl_state": "FINISHED",
        "crawl_stopped": "2011-03-18T03:25:12+0000",
        "num_updated": 0,
        "num_new": 6,
        "num_failed": 0,
        "num_deleted": 0,
        "num_total": 6,
        "batch_job": false,
        "job_id": 2
    }
]
{code}
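Because the response is a JSON array with one object per run, the counters can be added up across runs. The following sketch uses sample data that mirrors the values in the example output; {{summarize_history}} and {{runs_needing_attention}} are hypothetical helpers, and treating any state other than FINISHED as worth a second look is an assumption made for illustration.

```python
import json

# Sample data mirroring the example output on this page.
SAMPLE_RESPONSE = """
[
  {"id": 2, "crawl_started": "2011-03-17T22:16:46+0000",
   "crawl_stopped": "2011-03-17T22:16:51+0000", "crawl_state": "FINISHED",
   "num_unchanged": 0, "num_updated": 0, "num_new": 6, "num_failed": 0,
   "num_deleted": 0, "num_total": 6, "batch_job": false, "job_id": 3},
  {"id": 2, "crawl_started": "2011-03-18T03:25:04+0000",
   "crawl_stopped": "2011-03-18T03:25:12+0000", "crawl_state": "FINISHED",
   "num_unchanged": 0, "num_updated": 0, "num_new": 6, "num_failed": 0,
   "num_deleted": 0, "num_total": 6, "batch_job": false, "job_id": 2}
]
"""

COUNT_KEYS = ("num_new", "num_updated", "num_unchanged",
              "num_deleted", "num_failed", "num_total")


def summarize_history(entries):
    """Sum each document counter over all runs in the history."""
    totals = {key: 0 for key in COUNT_KEYS}
    for entry in entries:
        for key in COUNT_KEYS:
            totals[key] += entry.get(key, 0)
    return totals


def runs_needing_attention(entries):
    """Return runs with failed documents or a state other than FINISHED."""
    return [
        entry for entry in entries
        if entry.get("num_failed", 0) > 0 or entry.get("crawl_state") != "FINISHED"
    ]


history = json.loads(SAMPLE_RESPONSE)
totals = summarize_history(history)
print(totals["num_new"], len(runs_needing_attention(history)))  # 12 0
```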
{excerpt}
{scrollbar}