java.lang.Object
  dk.netarkivet.harvester.harvesting.JMXHeritrixController
public class JMXHeritrixController extends java.lang.Object implements HeritrixController
This implementation of the HeritrixController interface starts Heritrix as a separate process and uses JMX to communicate with it. Each instance executes exactly one process that runs exactly one crawl job.
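The one-instance-per-crawl-job contract described above suggests a simple lifecycle: initialize, start, poll until the crawl has ended, then clean up. The following is only a minimal sketch of that flow, not code from this package. The nested HeritrixController interface is a local stand-in that redeclares just the documented methods used here, StubController and runOneJob are hypothetical names, and calling cleanup() in a finally block after initialize() is an assumed ordering.

```java
/** Sketch of the one-controller-per-job lifecycle; not code from this package. */
final class CrawlLifecycleSketch {

    /** Local stand-in declaring only the documented methods this sketch calls. */
    interface HeritrixController {
        void initialize();
        void requestCrawlStart();
        boolean crawlIsEnded();
        void cleanup();
    }

    /** Hypothetical in-memory stub: reports the crawl as ended after a fixed number of polls. */
    static final class StubController implements HeritrixController {
        private int pollsLeft;
        boolean cleanedUp;

        StubController(int polls) { this.pollsLeft = polls; }
        public void initialize() { }
        public void requestCrawlStart() { }
        public boolean crawlIsEnded() { return pollsLeft-- <= 0; }
        public void cleanup() { cleanedUp = true; }
    }

    /** Runs one crawl job and returns how many polls reported the crawl as still running. */
    static int runOneJob(HeritrixController controller, long pollMillis) {
        int polls = 0;
        controller.initialize();
        try {
            controller.requestCrawlStart();
            while (!controller.crawlIsEnded()) {
                polls++;
                try {
                    Thread.sleep(pollMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt for callers
                    break;
                }
            }
        } finally {
            controller.cleanup(); // assumed: always release resources once initialized
        }
        return polls;
    }

    public static void main(String[] args) {
        StubController stub = new StubController(3);
        // prints: polls=3 cleanedUp=true
        System.out.println("polls=" + runOneJob(stub, 1L) + " cleanedUp=" + stub.cleanedUp);
    }
}
```

With the real class, the stub would be replaced by a JMXHeritrixController built from a HeritrixFiles object, and a fresh instance would be created for each crawl job.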
Constructor Summary | |
---|---|
JMXHeritrixController(HeritrixFiles files) | Create a JMXHeritrixController object. |
Method Summary | |
---|---|
boolean atFinish() | Query whether Heritrix is in a state where it can finish crawling. |
void beginCrawlStop() | Tell Heritrix to stop crawling. |
void cleanup() | Release any resources kept by the class. |
boolean crawlIsEnded() | Returns true if the crawl has ended, either because Heritrix finished or because we terminated it. |
int getActiveToeCount() | Get the number of currently active ToeThreads (crawler threads). |
int getCurrentProcessedKBPerSec() | Get an estimate of the rate, in KB per second, at which documents are currently being processed by the crawler. |
java.lang.String getProgressStats() | Get a human-readable set of statistics on the progress of the crawl. |
long getQueuedUriCount() | Get the number of URIs currently on the queue to be processed. |
void initialize() | Initialize a new CrawlController for executing a Heritrix crawl. |
boolean isPaused() | Returns true if the crawler has been paused and is thus not supposed to fetch anything. |
void requestCrawlStart() | Request that Heritrix start crawling. |
void requestCrawlStop(java.lang.String reason) | Request that crawling stops. |
java.lang.String toString() | Get a string that describes the current controller in terms of job ID, harvest ID and other usable information. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public JMXHeritrixController(HeritrixFiles files)
Create a JMXHeritrixController object.
Parameters: files - Files that are used to set up Heritrix.
Method Detail |
---|
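public void initialize()
Description copied from interface: HeritrixController
Initialize a new CrawlController for executing a Heritrix crawl.
Specified by: initialize in interface HeritrixController
See Also: HeritrixController.initialize()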
public void requestCrawlStart()
Description copied from interface: HeritrixController
Request that Heritrix start crawling.
Specified by: requestCrawlStart in interface HeritrixController
See Also: HeritrixController.requestCrawlStart()
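public boolean atFinish()
Description copied from interface: HeritrixController
Query whether Heritrix is in a state where it can finish crawling.
Specified by: atFinish in interface HeritrixController
See Also: HeritrixController.atFinish()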
public void beginCrawlStop()
Description copied from interface: HeritrixController
Tell Heritrix to stop crawling.
Specified by: beginCrawlStop in interface HeritrixController
See Also: HeritrixController.beginCrawlStop()
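public int getActiveToeCount()
Description copied from interface: HeritrixController
Get the number of currently active ToeThreads (crawler threads).
Specified by: getActiveToeCount in interface HeritrixController
See Also: HeritrixController.getActiveToeCount()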
public void requestCrawlStop(java.lang.String reason)
Description copied from interface: HeritrixController
Request that crawling stops.
Specified by: requestCrawlStop in interface HeritrixController
Parameters: reason - A human-readable reason the crawl is being stopped.
See Also: HeritrixController.requestCrawlStop(String)
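Because requestCrawlStop takes a human-readable reason, a caller can record why it gave up on a crawl. The sketch below illustrates one possible policy, an inactivity watchdog that requests a stop after several consecutive polls with no active ToeThreads. The policy, the class InactivityStopSketch, the stub, and the watch helper are all hypothetical; the nested HeritrixController interface is a local stand-in that redeclares only the documented methods used, and nothing here is claimed to match how this package decides to stop a crawl.

```java
/** Sketch of an inactivity watchdog; the stop policy is a hypothetical example. */
final class InactivityStopSketch {

    /** Local stand-in declaring only the documented methods this sketch calls. */
    interface HeritrixController {
        int getActiveToeCount();
        boolean crawlIsEnded();
        void requestCrawlStop(String reason);
    }

    /** Hypothetical stub: replays fixed ToeThread counts and ends as soon as a stop is requested. */
    static final class StubController implements HeritrixController {
        private final int[] toeCounts;
        private int next;
        String stopReason; // null until requestCrawlStop is called

        StubController(int... toeCounts) { this.toeCounts = toeCounts; }
        public int getActiveToeCount() {
            // Repeat the last value once the scripted sequence is exhausted.
            return toeCounts[Math.min(next++, toeCounts.length - 1)];
        }
        public boolean crawlIsEnded() { return stopReason != null; }
        public void requestCrawlStop(String reason) { stopReason = reason; }
    }

    /**
     * Polls until the crawl has ended and requests a stop, once, after maxIdlePolls
     * consecutive polls with zero active ToeThreads. Returns the number of polls made.
     * A real loop would sleep between polls; that is omitted to keep the sketch short.
     */
    static int watch(HeritrixController controller, int maxIdlePolls) {
        int polls = 0;
        int idlePolls = 0;
        boolean stopRequested = false;
        while (!controller.crawlIsEnded()) {
            polls++;
            idlePolls = controller.getActiveToeCount() == 0 ? idlePolls + 1 : 0;
            if (!stopRequested && idlePolls >= maxIdlePolls) {
                controller.requestCrawlStop(
                        "No active ToeThreads for " + idlePolls + " consecutive polls");
                stopRequested = true;
            }
        }
        return polls;
    }

    public static void main(String[] args) {
        StubController stub = new StubController(4, 0, 2, 0, 0, 0);
        int polls = watch(stub, 3);
        System.out.println(polls + " polls; reason: " + stub.stopReason);
    }
}
```

The idle counter resets whenever a poll sees at least one active thread, so a briefly quiet crawl is not stopped.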
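public long getQueuedUriCount()
Description copied from interface: HeritrixController
Get the number of URIs currently on the queue to be processed.
Specified by: getQueuedUriCount in interface HeritrixController
See Also: HeritrixController.getQueuedUriCount()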
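public int getCurrentProcessedKBPerSec()
Description copied from interface: HeritrixController
Get an estimate of the rate, in KB per second, at which documents are currently being processed by the crawler.
Specified by: getCurrentProcessedKBPerSec in interface HeritrixController
See Also: HeritrixController.getCurrentProcessedKBPerSec()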
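The three numeric getters (getActiveToeCount, getQueuedUriCount, getCurrentProcessedKBPerSec) can be combined into a compact progress line for logging. The sketch below shows one way to do that. The output format, the class CrawlSnapshotSketch, and the FixedController stub are hypothetical, and the nested HeritrixController interface is a local stand-in that redeclares only the documented signatures used.

```java
/** Sketch that formats the documented numeric getters into one log line. */
final class CrawlSnapshotSketch {

    /** Local stand-in declaring only the documented methods this sketch calls. */
    interface HeritrixController {
        int getActiveToeCount();
        long getQueuedUriCount();
        int getCurrentProcessedKBPerSec();
    }

    /** Hypothetical stub returning fixed values. */
    static final class FixedController implements HeritrixController {
        private final long queued;
        private final int activeToes;
        private final int kbPerSec;

        FixedController(long queued, int activeToes, int kbPerSec) {
            this.queued = queued;
            this.activeToes = activeToes;
            this.kbPerSec = kbPerSec;
        }
        public int getActiveToeCount() { return activeToes; }
        public long getQueuedUriCount() { return queued; }
        public int getCurrentProcessedKBPerSec() { return kbPerSec; }
    }

    /** Builds a one-line snapshot; the field names in the line are this sketch's own choice. */
    static String snapshot(HeritrixController c) {
        return "queued=" + c.getQueuedUriCount()
                + " activeToeThreads=" + c.getActiveToeCount()
                + " rate=" + c.getCurrentProcessedKBPerSec() + "KB/s";
    }

    public static void main(String[] args) {
        // prints: queued=3500 activeToeThreads=12 rate=87KB/s
        System.out.println(snapshot(new FixedController(3500L, 12, 87)));
    }
}
```

Note that getQueuedUriCount returns a long while the other two getters return int, so a caller storing these values should keep the wider type for the queue size.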
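public java.lang.String getProgressStats()
Description copied from interface: HeritrixController
Get a human-readable set of statistics on the progress of the crawl.
Specified by: getProgressStats in interface HeritrixController
See Also: HeritrixController.getProgressStats()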
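public boolean isPaused()
Description copied from interface: HeritrixController
Returns true if the crawler has been paused and is thus not supposed to fetch anything.
Specified by: isPaused in interface HeritrixController
See Also: HeritrixController.isPaused()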
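public boolean crawlIsEnded()
Returns true if the crawl has ended, either because Heritrix finished or because we terminated it.
Specified by: crawlIsEnded in interface HeritrixController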
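A monitoring loop may need to tell an ended crawl from a paused one, and a paused crawl from one that is merely idle. The sketch below combines crawlIsEnded, isPaused and getActiveToeCount into a single coarse state. The State enum, the precedence of the checks, and the FixedController stub are hypothetical choices made for illustration, and the nested HeritrixController interface is a local stand-in that redeclares only the documented methods used.

```java
/** Sketch that derives a coarse crawl state from three documented queries. */
final class CrawlStateSketch {

    /** Local stand-in declaring only the documented methods this sketch calls. */
    interface HeritrixController {
        boolean crawlIsEnded();
        boolean isPaused();
        int getActiveToeCount();
    }

    /** Hypothetical coarse states; not part of the documented API. */
    enum State { ENDED, PAUSED, IDLE, FETCHING }

    /** Hypothetical stub returning fixed answers. */
    static final class FixedController implements HeritrixController {
        private final boolean ended;
        private final boolean paused;
        private final int activeToes;

        FixedController(boolean ended, boolean paused, int activeToes) {
            this.ended = ended;
            this.paused = paused;
            this.activeToes = activeToes;
        }
        public boolean crawlIsEnded() { return ended; }
        public boolean isPaused() { return paused; }
        public int getActiveToeCount() { return activeToes; }
    }

    /** Checks ENDED first, then PAUSED, so the other queries only matter for a live crawl. */
    static State classify(HeritrixController c) {
        if (c.crawlIsEnded()) {
            return State.ENDED;
        }
        if (c.isPaused()) {
            return State.PAUSED;
        }
        return c.getActiveToeCount() > 0 ? State.FETCHING : State.IDLE;
    }

    public static void main(String[] args) {
        System.out.println(classify(new FixedController(false, false, 8))); // prints: FETCHING
        System.out.println(classify(new FixedController(false, true, 0)));  // prints: PAUSED
    }
}
```

Checking crawlIsEnded first is a design choice of this sketch: once the crawl has ended, the paused flag and thread count are treated as irrelevant.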
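public void cleanup()
Description copied from interface: HeritrixController
Release any resources kept by the class.
Specified by: cleanup in interface HeritrixController
See Also: HeritrixController.cleanup()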
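public java.lang.String toString()
Get a string that describes the current controller in terms of job ID, harvest ID and other usable information.
Overrides: toString in class java.lang.Object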