java.lang.Object
  dk.netarkivet.harvester.harvesting.HeritrixLauncher
public abstract class HeritrixLauncher
A HeritrixLauncher object wraps an instance of the web crawler Heritrix. The object is constructed with the information needed to run a crawl. The crawl is performed when doCrawl() is called. doCrawl() monitors progress and returns when the crawl has finished or must be stopped because it has stalled.
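The lifecycle described above (construct with crawl information, start the crawl, poll for progress, stop on completion or stall) can be illustrated with a small self-contained sketch. Nothing in it is NetarchiveSuite or Heritrix code: `FakeCrawler`, `SketchLauncher`, `PollingLauncher`, the `Outcome` enum, the stall threshold, and the millisecond wait period are all assumptions made for the example. The documented doCrawl() returns void and the documented wait period is in seconds; the sketch returns an outcome only so the result is easy to inspect.

```java
public class LauncherSketch {

    /** Possible results of the sketched crawl (hypothetical, not in the real API). */
    enum Outcome { FINISHED, STALLED, INTERRUPTED }

    /** Hypothetical stand-in for a crawler; NOT the Heritrix API. */
    static class FakeCrawler {
        private final int totalPages;
        private final int stallAfter; // progress stops at this count; negative = never stalls
        private int fetched = 0;

        FakeCrawler(int totalPages, int stallAfter) {
            this.totalPages = totalPages;
            this.stallAfter = stallAfter;
        }

        /** Simulates the work done between two progress checks. */
        void step() {
            if (fetched < totalPages && fetched != stallAfter) {
                fetched++;
            }
        }

        int fetched() { return fetched; }

        boolean finished() { return fetched >= totalPages; }
    }

    /** Mirrors the shape of the documented class: construct first, then call doCrawl(). */
    abstract static class SketchLauncher {
        /** Assumed wait between progress checks, in milliseconds to keep the demo fast. */
        protected static final int WAIT_PERIOD_MS = 1;

        protected final FakeCrawler crawler;

        protected SketchLauncher(FakeCrawler crawler) {
            if (crawler == null) {
                throw new IllegalArgumentException("crawler must not be null");
            }
            this.crawler = crawler;
        }

        /** Launches the crawl and monitors its progress. */
        public abstract Outcome doCrawl();
    }

    /** One possible subclass: poll periodically and give up after repeated idle checks. */
    static class PollingLauncher extends SketchLauncher {
        private static final int MAX_IDLE_CHECKS = 3; // assumed stall threshold

        PollingLauncher(FakeCrawler crawler) { super(crawler); }

        @Override
        public Outcome doCrawl() {
            int idleChecks = 0;
            int lastSeen = crawler.fetched();
            while (!crawler.finished()) {
                crawler.step();
                try {
                    Thread.sleep(WAIT_PERIOD_MS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return Outcome.INTERRUPTED;
                }
                if (crawler.fetched() == lastSeen) {
                    // No progress since the previous check.
                    if (++idleChecks >= MAX_IDLE_CHECKS) {
                        return Outcome.STALLED;
                    }
                } else {
                    idleChecks = 0;
                    lastSeen = crawler.fetched();
                }
            }
            return Outcome.FINISHED;
        }
    }

    public static void main(String[] args) {
        System.out.println(new PollingLauncher(new FakeCrawler(5, -1)).doCrawl());
        System.out.println(new PollingLauncher(new FakeCrawler(5, 2)).doCrawl());
    }
}
```

In this sketch the stall check is a simple counter of consecutive idle polls; a real launcher would more likely compare crawler statistics or timestamps, which this page does not specify.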
Field Summary | |
---|---|
protected static int |
CRAWL_CONTROL_WAIT_PERIOD
The period to wait in seconds before checking if Heritrix has done anything. |
(package private) org.apache.commons.logging.Log |
log
The class logger. |
Constructor Summary | |
---|---|
protected |
HeritrixLauncher(HeritrixFiles files)
Protected HeritrixLauncher constructor. |
|
HeritrixLauncher(java.lang.Object... args)
Generic constructor to allow HeritrixLauncher to use any implementation of HeritrixController. |
Method Summary | |
---|---|
abstract void |
doCrawl()
Launches the crawl and monitors its progress. |
protected java.lang.Object[] |
getControllerArguments()
|
protected HeritrixFiles |
getHeritrixFiles()
|
void |
setupOrderfile(HeritrixFiles files)
|
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected static final int CRAWL_CONTROL_WAIT_PERIOD
final org.apache.commons.logging.Log log
Constructor Detail |
---|
protected HeritrixLauncher(HeritrixFiles files) throws ArgumentNotValid
Parameters:
files - Object encapsulating location of Heritrix crawldir and configuration files.
Throws:
ArgumentNotValid - If either seedsfile or orderfile does not exist.
public HeritrixLauncher(java.lang.Object... args)
Parameters:
args - the arguments to be passed to the constructor or non-static factory method of the HeritrixController class specified in settings
Method Detail |
---|
public abstract void doCrawl() throws IOFailure
Throws:
IOFailure
protected HeritrixFiles getHeritrixFiles()
protected java.lang.Object[] getControllerArguments()
public void setupOrderfile(HeritrixFiles files)
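The generic constructor HeritrixLauncher(java.lang.Object... args) is documented as forwarding its arguments to a constructor or non-static factory method of whichever HeritrixController class the settings name. One plausible way to forward arguments like that is reflection. The sketch below shows the idea with stand-in types only: `Controller`, `EchoController`, `instantiate`, and the example argument values are hypothetical, and this page does not confirm that NetarchiveSuite does it this way.

```java
import java.lang.reflect.Constructor;

public class ControllerFactorySketch {

    /** Hypothetical stand-in for a controller interface; NOT HeritrixController. */
    public interface Controller {
        String describe();
    }

    /** Hypothetical implementation whose constructor takes two arguments. */
    public static class EchoController implements Controller {
        private final String crawlDir;
        private final Integer port;

        public EchoController(String crawlDir, Integer port) {
            this.crawlDir = crawlDir;
            this.port = port;
        }

        @Override
        public String describe() {
            return crawlDir + ":" + port;
        }
    }

    /**
     * Loads the named class and calls the first public constructor whose
     * parameter types accept the given arguments. Any failure is reported
     * as an IllegalArgumentException.
     */
    public static Controller instantiate(String className, Object... args) {
        try {
            Class<?> cls = Class.forName(className);
            for (Constructor<?> ctor : cls.getConstructors()) {
                if (accepts(ctor.getParameterTypes(), args)) {
                    return (Controller) ctor.newInstance(args);
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot create " + className, e);
        }
        throw new IllegalArgumentException("No matching constructor in " + className);
    }

    /** Simplified matching: primitive parameter types are not supported in this sketch. */
    private static boolean accepts(Class<?>[] types, Object[] args) {
        if (types.length != args.length) {
            return false;
        }
        for (int i = 0; i < types.length; i++) {
            if (types[i].isPrimitive()) {
                return false;
            }
            if (args[i] != null && !types[i].isInstance(args[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // In a real system the class name would come from configuration.
        String configured = EchoController.class.getName();
        Controller controller = instantiate(configured, "/tmp/crawl", 8192);
        System.out.println(controller.describe());
    }
}
```

Keeping the argument list as a plain Object[] is what lets one launcher work with controller implementations that need different constructor arguments, at the cost of losing compile-time type checking.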