public class HeritrixLauncher extends HeritrixLauncherAbstract
HeritrixController. On every turn of the crawl control loop, this launcher asks the Heritrix3 controller to generate a progress report as a CrawlProgressMessage and then sends this message on the JMS bus, to be consumed by the HarvestMonitor instance.

Fields inherited from class HeritrixLauncherAbstract: CRAWL_CONTROL_WAIT_PERIOD
Modifier and Type | Method and Description |
---|---|
void | doCrawl() Initializes a Heritrix3 controller, then launches the Heritrix3 instance. |
static HeritrixLauncher | getInstance(Heritrix3Files files, String jobName) Get an instance of this class. |
Methods inherited from class HeritrixLauncherAbstract: getControllerArguments, getHeritrixFiles, makeTemplateReadyForHeritrix3, setupOrderfile
public static HeritrixLauncher getInstance(Heritrix3Files files, String jobName) throws ArgumentNotValid
Parameters:
files - Object encapsulating the location of the Heritrix crawldir and configuration files
Returns:
HeritrixLauncher object
Throws:
ArgumentNotValid - If either order.xml or seeds.txt does not exist, or argument files is null.

public void doCrawl() throws IOFailure
HarvesterSettings.CRAWL_LOOP_WAIT_TIME.
CrawlProgressMessage from the Heritrix controller
doCrawl in class HeritrixLauncherAbstract
Throws:
IOFailure
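
The crawl control loop described for this class (wait, ask the controller for a progress report, send it on the bus) can be sketched roughly as below. This is a minimal illustration and not the NetarchiveSuite implementation: `ProgressReport`, `CrawlControlLoopSketch`, the `Supplier` standing in for the Heritrix3 controller, and the `Consumer` standing in for the JMS bus are all hypothetical names, and the wait-then-poll ordering and the stop condition are assumptions.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical stand-in for a progress report; not the real CrawlProgressMessage. */
final class ProgressReport {
    final String jobName;
    final boolean crawlFinished;

    ProgressReport(String jobName, boolean crawlFinished) {
        this.jobName = jobName;
        this.crawlFinished = crawlFinished;
    }
}

/** Sketch of a crawl control loop: wait, poll for progress, publish, repeat. */
final class CrawlControlLoopSketch {

    /**
     * @param controller stand-in for the Heritrix3 controller (assumption)
     * @param bus        stand-in for sending a message on the JMS bus (assumption)
     * @param waitMillis pause between two turns of the loop, in milliseconds
     * @return the number of progress reports that were published
     */
    static int run(Supplier<ProgressReport> controller,
                   Consumer<ProgressReport> bus,
                   long waitMillis) {
        int turns = 0;
        while (true) {
            try {
                // Wait before each turn so the controller is not polled continuously.
                Thread.sleep(waitMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop the loop.
                Thread.currentThread().interrupt();
                return turns;
            }
            // Ask the controller for a fresh progress report.
            ProgressReport report = controller.get();
            // Publish the report for the monitoring side to consume.
            bus.accept(report);
            turns++;
            if (report.crawlFinished) {
                return turns;
            }
        }
    }
}
```

In this sketch the final report, the one that signals the end of the crawl, is still published before the loop exits, so a consumer sees the finished state at least once.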
Copyright © 2005–2016 The Royal Danish Library, the Danish State and University Library, the National Library of France and the Austrian National Library. All rights reserved.