public abstract class HeritrixLauncher extends Object
Modifier and Type | Field and Description |
---|---|
protected static int | CRAWL_CONTROL_WAIT_PERIOD: The period to wait in seconds before checking if Heritrix has done anything. |
Modifier | Constructor and Description |
---|---|
protected | HeritrixLauncher(HeritrixFiles files): Protected HeritrixLauncher constructor. |
public | HeritrixLauncher(Object... args): Generic constructor to allow HeritrixLauncher to use any implementation of HeritrixController. |
Modifier and Type | Method and Description |
---|---|
abstract void | doCrawl(): Launches the crawl and monitors its progress. |
protected Object[] | getControllerArguments() |
protected HeritrixFiles | getHeritrixFiles() |
static void | makeTemplateReadyForHeritrix1(HeritrixFiles files): Updates the diskpath value, archivefile_prefix, seedsfile, and deduplication information. |
void | setupOrderfile(HeritrixFiles files) |
protected static final int CRAWL_CONTROL_WAIT_PERIOD
protected HeritrixLauncher(HeritrixFiles files) throws ArgumentNotValid
files - Object encapsulating location of Heritrix crawldir and configuration files.
ArgumentNotValid - If either seedsfile or orderfile does not exist.
public HeritrixLauncher(Object... args)
args - The arguments to be passed to the constructor or non-static factory method of the HeritrixController class specified in settings.
public abstract void doCrawl() throws IOFailure
IOFailure
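As an illustration of how a doCrawl() implementation might use CRAWL_CONTROL_WAIT_PERIOD, the following is a minimal sketch of a poll-and-wait loop. It is not the NetarchiveSuite implementation: the class CrawlMonitorSketch, the monitor method, the BooleanSupplier standing in for the "has Heritrix finished" check, and the placeholder wait value are all assumptions made for the example.

```java
import java.util.function.BooleanSupplier;

final class CrawlMonitorSketch {
    /** Placeholder for CRAWL_CONTROL_WAIT_PERIOD (seconds); the real value is not given on this page. */
    static final int CRAWL_CONTROL_WAIT_PERIOD = 1;

    /**
     * Checks crawlFinished repeatedly, sleeping waitMillis between checks.
     * Returns the number of times the loop had to wait.
     */
    static int monitor(BooleanSupplier crawlFinished, long waitMillis) {
        int waits = 0;
        while (!crawlFinished.getAsBoolean()) {
            waits++;
            try {
                Thread.sleep(waitMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop monitoring.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while monitoring crawl", e);
            }
        }
        return waits;
    }

    public static void main(String[] args) {
        // Hypothetical crawl that reports "finished" on the fourth check.
        int[] remaining = {3};
        int waits = monitor(() -> remaining[0]-- <= 0, CRAWL_CONTROL_WAIT_PERIOD * 1000L);
        System.out.println("Crawl finished after " + waits + " wait periods");
    }
}
```

A real launcher would replace the supplier with a query against the running Heritrix instance and would likely report failures as IOFailure rather than IllegalStateException.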
protected HeritrixFiles getHeritrixFiles()
protected Object[] getControllerArguments()
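The constructor documentation says the arguments are passed to the constructor or factory method of the HeritrixController class named in settings. One way such a hand-off could work is reflective instantiation, sketched below. This is an assumption about the mechanism, not a description of the library: ControllerFactorySketch, instantiate, and accepts are hypothetical names, and the sketch covers only public constructors with non-primitive parameters.

```java
import java.lang.reflect.Constructor;

final class ControllerFactorySketch {
    /** Creates an instance of className using the first public constructor that accepts args. */
    static Object instantiate(String className, Object... args) {
        try {
            Class<?> cls = Class.forName(className);
            for (Constructor<?> c : cls.getConstructors()) {
                if (accepts(c.getParameterTypes(), args)) {
                    return c.newInstance(args);
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate " + className, e);
        }
        throw new IllegalArgumentException("No matching public constructor in " + className);
    }

    /** True if every argument is null or an instance of the corresponding parameter type. */
    private static boolean accepts(Class<?>[] types, Object[] args) {
        if (types.length != args.length) {
            return false;
        }
        for (int i = 0; i < types.length; i++) {
            // Note: primitive parameter types never match here; this is a simplification.
            if (args[i] != null && !types[i].isInstance(args[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // StringBuilder stands in for a controller class read from settings.
        Object o = instantiate("java.lang.StringBuilder", "abc");
        System.out.println(o.getClass().getName() + ": " + o);
    }
}
```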
public void setupOrderfile(HeritrixFiles files)
public static void makeTemplateReadyForHeritrix1(HeritrixFiles files) throws IOFailure
files - Files associated with a Heritrix1 crawl-job.
This method prepares the orderfile used by the Heritrix crawler.
1. Verifies that the template is in fact an H1HeritrixTemplate.
2. Alters the orderfile, overriding whatever values it already contains (the diskpath value, archivefile_prefix, and seedsfile named in the method summary).
4. If deduplication is enabled in the order.xml, writes the absolute path of the Lucene index used by the deduplication processor.
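The kind of orderfile update described in the steps above can be sketched with the JDK's DOM and XPath APIs. This is a simplified illustration, not the library's code: the class OrderFileSketch, the rewrite method, and the element layout (string elements identified by a name attribute of disk-path, seedsfile, or prefix) are assumptions for the example, and IllegalStateException stands in for IOFailure.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class OrderFileSketch {
    /** Overwrites the text of the string element with the given name; fails if it is absent. */
    private static void setText(Document doc, XPath xpath, String name, String value) throws Exception {
        Node node = (Node) xpath.evaluate("//string[@name='" + name + "']", doc, XPathConstants.NODE);
        if (node == null) {
            throw new IllegalStateException("Element not found in order XML: " + name);
        }
        node.setTextContent(value);
    }

    /** Returns a copy of orderXml with the three values replaced, whatever they were before. */
    static String rewrite(String orderXml, String diskPath, String seedsFile, String prefix) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(orderXml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            setText(doc, xpath, "disk-path", diskPath);
            setText(doc, xpath, "seedsfile", seedsFile);
            setText(doc, xpath, "prefix", prefix);
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalStateException("Could not rewrite order XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<order><string name=\"disk-path\">old</string>"
                + "<string name=\"seedsfile\">old.txt</string>"
                + "<string name=\"prefix\">OLD</string></order>";
        System.out.println(rewrite(xml, "/crawl/dir", "/crawl/dir/seeds.txt", "job42"));
    }
}
```

The sketch works on strings to stay self-contained; the documented method works on the files referenced by HeritrixFiles and also handles the deduplication index path, which is omitted here.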
IOFailure - When the orderfile could not be saved to disk, or when a specific element cannot be found in the document.
Copyright © 2005–2015 The Royal Danish Library, the Danish State and University Library, the National Library of France and the Austrian National Library. All rights reserved.