public interface IHeritrixController
Modifier and Type | Method and Description |
---|---|
boolean | atFinish(): Query whether Heritrix is in a state where it can finish crawling. |
void | beginCrawlStop(): Tell Heritrix to stop crawling. |
void | cleanup(): Release any resources kept by the class. |
boolean | crawlIsEnded(): Returns true if the crawl has ended, either because Heritrix finished or because we terminated it. |
int | getActiveToeCount(): Get the number of currently active ToeThreads (crawler threads). |
int | getCurrentProcessedKBPerSec(): Get an estimate of the rate, in KB per second, at which documents are currently being processed by the crawler. |
String | getHarvestInformation(): Get harvest information. |
String | getProgressStats(): Get a human-readable set of statistics on the progress of the crawl. |
long | getQueuedUriCount(): Get the number of URIs currently on the queue to be processed. |
void | initialize(): Initialize a new CrawlController for executing a Heritrix crawl. |
boolean | isPaused(): Returns true if the crawler has been paused and is thus not supposed to fetch anything. |
void | requestCrawlStart(): Request that Heritrix start crawling. |
void | requestCrawlStop(String reason): Request that the crawler stops. |
void | stopHeritrix(): Stop the Heritrix process. |
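The method summary suggests a lifecycle of initialize, start, poll for completion, then shut down. The following is a minimal sketch of that flow, not the NetarchiveSuite implementation. `CrawlControl`, `ScriptedCrawl`, and `CrawlLifecycleDemo` are hypothetical names introduced so the example compiles on its own. The poll budget and the order of `stopHeritrix()` before `cleanup()` are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in mirroring a subset of the IHeritrixController
// signatures, so the sketch compiles without NetarchiveSuite.
interface CrawlControl {
    void initialize();
    void requestCrawlStart();
    void requestCrawlStop(String reason);
    boolean crawlIsEnded();
    void stopHeritrix();
    void cleanup();
}

// Hypothetical in-memory implementation: the crawl "ends" after a fixed
// number of crawlIsEnded() polls. Every state-changing call is recorded.
class ScriptedCrawl implements CrawlControl {
    final List<String> calls = new ArrayList<>();
    private int pollsLeft;

    ScriptedCrawl(int pollsUntilEnd) { this.pollsLeft = pollsUntilEnd; }

    public void initialize() { calls.add("initialize"); }
    public void requestCrawlStart() { calls.add("requestCrawlStart"); }
    public void requestCrawlStop(String reason) {
        calls.add("requestCrawlStop");
        pollsLeft = 0; // a stop request ends the scripted crawl immediately
    }
    public boolean crawlIsEnded() { return pollsLeft-- <= 0; }
    public void stopHeritrix() { calls.add("stopHeritrix"); }
    public void cleanup() { calls.add("cleanup"); }
}

class CrawlLifecycleDemo {
    // Drives one crawl; requests a stop once maxPolls checks have been made.
    static void runCrawl(CrawlControl c, int maxPolls) {
        c.initialize();
        try {
            c.requestCrawlStart();
            int polls = 0;
            while (!c.crawlIsEnded()) {
                if (++polls >= maxPolls) {
                    c.requestCrawlStop("poll budget exhausted");
                }
            }
        } finally {
            // Shut down and release resources even if starting the crawl failed.
            c.stopHeritrix();
            c.cleanup();
        }
    }

    public static void main(String[] args) {
        ScriptedCrawl crawl = new ScriptedCrawl(3);
        runCrawl(crawl, 10);
        System.out.println(crawl.calls);
    }
}
```

A real caller would sleep between polls rather than spin, and would pass a reason that is meaningful to an operator, since `requestCrawlStop` documents its argument as human-readable.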
void initialize()
void requestCrawlStart() throws IOFailure
Throws: IOFailure - If something goes wrong during startup.
void beginCrawlStop()
void requestCrawlStop(String reason)
Parameters: reason - A human-readable reason the crawl is being stopped.
boolean atFinish()
boolean crawlIsEnded()
int getActiveToeCount()
long getQueuedUriCount()
int getCurrentProcessedKBPerSec()
See Also: StatisticsTracking.currentProcessedKBPerSec()
String getProgressStats()
boolean isPaused()
void cleanup()
String getHarvestInformation()
void stopHeritrix()
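The status getters above can be combined into a compact one-line progress report. The sketch below assumes nothing beyond the listed signatures. `CrawlStatus`, `FixedStatus`, and `StatusLine` are hypothetical names for illustration, and the line format is an arbitrary choice rather than what `getProgressStats()` returns.

```java
// Hypothetical stand-in mirroring the status getters of IHeritrixController.
interface CrawlStatus {
    int getActiveToeCount();
    long getQueuedUriCount();
    int getCurrentProcessedKBPerSec();
    boolean isPaused();
    boolean crawlIsEnded();
}

// Hypothetical fixed snapshot, used only to exercise the formatter.
class FixedStatus implements CrawlStatus {
    private final int toes;
    private final long queued;
    private final int kbPerSec;
    private final boolean paused;
    private final boolean ended;

    FixedStatus(int toes, long queued, int kbPerSec, boolean paused, boolean ended) {
        this.toes = toes;
        this.queued = queued;
        this.kbPerSec = kbPerSec;
        this.paused = paused;
        this.ended = ended;
    }

    public int getActiveToeCount() { return toes; }
    public long getQueuedUriCount() { return queued; }
    public int getCurrentProcessedKBPerSec() { return kbPerSec; }
    public boolean isPaused() { return paused; }
    public boolean crawlIsEnded() { return ended; }
}

class StatusLine {
    // Formats one snapshot; an ended crawl takes precedence over a paused one.
    static String format(CrawlStatus s) {
        String state = s.crawlIsEnded() ? "ENDED" : (s.isPaused() ? "PAUSED" : "RUNNING");
        return state
                + " toes=" + s.getActiveToeCount()
                + " queued=" + s.getQueuedUriCount()
                + " rate=" + s.getCurrentProcessedKBPerSec() + "KB/s";
    }

    public static void main(String[] args) {
        System.out.println(format(new FixedStatus(25, 1200L, 340, false, false)));
    }
}
```

Reading all getters in one pass keeps the report cheap, but the values are not an atomic snapshot if the crawler changes state between calls.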
Copyright © 2005–2015 The Royal Danish Library, the Danish State and University Library, the National Library of France and the Austrian National Library. All rights reserved.