java.lang.Object
  dk.netarkivet.harvester.distribute.HarvesterMessageHandler
    dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
public class HarvestControllerServer
This class responds to JMS doOneCrawl messages from the HarvestScheduler and launches a Heritrix crawl with the received job description. The generated ARC files are uploaded to the bitarchives once a harvest job has been completed.

During operation, CrawlStatus messages are sent to the HarvestSchedulerMonitorServer. When harvesting starts, a message is sent with status 'STARTED'. When it finishes, a message is sent with status 'DONE' or 'FAILED'. A 'DONE' or 'FAILED' message with the result should ALWAYS be sent if at all possible, but only ever one such message per job.

It must be possible to run the Heritrix harvester on several machines, with several processes on each machine. Each instance of Heritrix is started and monitored by a HarvestControllerServer.

If the VM is stopped during a harvest, it will, on restart, read all directories under serverdir looking for harvestinfo files. If any are found, they are parsed for information, and an attempt is made to upload all remaining files to the bitarchive. A CrawlStatus message with status 'FAILED' is then sent.

A new thread is started for each actual crawl, and the JMS listener is removed in that thread. Threading is required because JMS does not let the thread that is handling a message remove the listener for it. If the crawl fails to start, the listener is re-added; otherwise the VM is shut down and restarted by the SideKickApplication.
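The status-message contract described above (one 'STARTED' message, then exactly one of 'DONE' or 'FAILED', with the crawl running on its own thread) can be illustrated with a minimal sketch. This is not the HarvestControllerServer implementation: CrawlStatusReporter, its methods, and the in-memory list that stands in for JMS sends are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch; none of these names come from the real code base. */
class CrawlStatusReporter {
    enum Status { STARTED, DONE, FAILED }

    // Stand-in for messages sent over JMS: we only record them here.
    private final List<Status> sent =
            Collections.synchronizedList(new ArrayList<>());
    // Flips to true when the first DONE/FAILED message has been sent.
    private final AtomicBoolean finalSent = new AtomicBoolean(false);

    void started() {
        sent.add(Status.STARTED);
    }

    /**
     * Sends DONE or FAILED, but only on the first call.
     * Returns true if the message was sent, false if one was sent earlier.
     */
    boolean finish(Status status) {
        if (status == Status.STARTED) {
            throw new IllegalArgumentException("STARTED is not a final status");
        }
        if (finalSent.compareAndSet(false, true)) {
            sent.add(status);
            return true;
        }
        return false;
    }

    List<Status> sentMessages() {
        return new ArrayList<>(sent);
    }

    /** Runs the crawl on a separate thread and reports its outcome. */
    static CrawlStatusReporter runCrawl(Runnable crawl) {
        CrawlStatusReporter reporter = new CrawlStatusReporter();
        Thread worker = new Thread(() -> {
            reporter.started();
            try {
                crawl.run();
                reporter.finish(Status.DONE);
            } catch (RuntimeException e) {
                reporter.finish(Status.FAILED);
            } finally {
                // Safety net: a no-op if a final status was already sent.
                reporter.finish(Status.FAILED);
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return reporter;
    }
}
```

Guarding the final message with a compare-and-set means that every code path, including the safety net in the finally block, can try to report a result without risking a second 'DONE' or 'FAILED' message for the same job.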
Field Summary | |
---|---|
(package private) static int | WAIT_FOR_HOSTS_REPORT_TIMEOUT_SECS: The max time to wait for the hosts-report.txt to be available (in secs). |
Method Summary | |
---|---|
void | cleanup(): Will be called on shutdown. |
void | close(): Release all JMS connections. |
static HarvestControllerServer | getInstance(): Returns or creates the unique instance of this singleton. The server creates an instance of the HarvestController, uploads ARC files from unfinished harvests, and starts to listen to JMS messages on the incoming JMS queues. |
void | visit(DoOneCrawlMessage msg): Receives a DoOneCrawlMessage and calls onDoOneCrawl. |
Methods inherited from class dk.netarkivet.harvester.distribute.HarvesterMessageHandler |
---|
onMessage, visit |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
static final int WAIT_FOR_HOSTS_REPORT_TIMEOUT_SECS
Method Detail |
---|
public static HarvestControllerServer getInstance() throws IOFailure
Throws:
PermissionDenied - if the serverdir or oldjobsdir can't be created
IOFailure - if data from old harvests exist, but contain illegal data
public void close()
public void cleanup()
Specified by: cleanup in interface CleanupIF
See Also: CleanupIF.cleanup()
public void visit(DoOneCrawlMessage msg) throws IOFailure, UnknownID, ArgumentNotValid, PermissionDenied
Specified by: visit in interface HarvesterMessageVisitor
Overrides: visit in class HarvesterMessageHandler
Parameters:
msg - the message received
Throws:
IOFailure - if the crawl fails, or if unable to write to harvestInfoFile
UnknownID - if jobID is null in the message
ArgumentNotValid - if the status of the job is not valid (it must be SUBMITTED)
PermissionDenied - if the crawldir can't be created
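The two message-level preconditions documented for visit (a non-null jobID and a job status of SUBMITTED) could be checked along the following lines. This is a sketch only: the exception classes below are local stand-ins that merely share their simple names with the NetarchiveSuite types, and CrawlRequestCheck, JobStatus, and check are hypothetical names, not part of the documented API.

```java
/** Local stand-in for the UnknownID exception named above (hypothetical). */
class UnknownID extends RuntimeException {
    UnknownID(String message) {
        super(message);
    }
}

/** Local stand-in for the ArgumentNotValid exception named above (hypothetical). */
class ArgumentNotValid extends RuntimeException {
    ArgumentNotValid(String message) {
        super(message);
    }
}

/** Hypothetical helper that mirrors the documented preconditions of visit. */
class CrawlRequestCheck {
    // Assumed set of job states; only SUBMITTED is confirmed by the text above.
    enum JobStatus { NEW, SUBMITTED, STARTED, DONE, FAILED }

    /** Throws if the job cannot be accepted for crawling; otherwise returns normally. */
    static void check(Long jobID, JobStatus status) {
        if (jobID == null) {
            throw new UnknownID("jobID is null in the message");
        }
        if (status != JobStatus.SUBMITTED) {
            throw new ArgumentNotValid(
                    "job " + jobID + " has status " + status + ", expected SUBMITTED");
        }
    }
}
```

Checking these conditions before any crawl directory is created keeps a rejected message from leaving files behind on disk.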