Class HarvestJob
java.lang.Object
    dk.netarkivet.harvester.heritrix3.HarvestJob
public class HarvestJob extends Object
-
-
Constructor Summary
Constructor and description:
HarvestJob(HarvestControllerServer hcs)
    Constructor.
-
Method Summary
Modifier and type, method, and description:
File createCrawlDir()
    Create the crawl dir, but make sure a message is sent if there is a problem.
Heritrix3Files getHeritrix3Files()
void init(Job job, HarvestDefinitionInfo origHarvestInfo, List<MetadataEntry> metadataEntries)
    Initialization of the harvestJob.
void runHarvest()
    Creates the actual HeritrixLauncher instance and runs it, after the various setup files have been written.
Heritrix3Files writeHarvestFiles(File crawldir, Job job, HarvestDefinitionInfo hdi, List<MetadataEntry> metadataEntries)
    Writes the files needed to start a harvest.
-
-
-
Constructor Detail
-
HarvestJob
public HarvestJob(HarvestControllerServer hcs)
Constructor.
Parameters:
hcs - a HarvestControllerServer instance
-
-
Method Detail
-
init
public void init(Job job, HarvestDefinitionInfo origHarvestInfo, List<MetadataEntry> metadataEntries)
Initialization of the harvestJob.
Parameters:
job - A job from the jobs table in the harvest database
origHarvestInfo - metadata about the harvest
metadataEntries - entries for the metadata file for the harvest
-
getHeritrix3Files
public Heritrix3Files getHeritrix3Files()
Returns:
the Heritrix3Files object initialized with the init() method.
-
runHarvest
public void runHarvest() throws ArgumentNotValid
Creates the actual HeritrixLauncher instance and runs it, after the various setup files have been written.
Throws:
ArgumentNotValid - if an argument isn't valid.
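Going by the method descriptions on this page, a caller would initialize the job, which prepares the Heritrix3Files object, and only then run the harvest. That call order can be sketched with stand-in types as follows. Everything in the block is hypothetical: `LifecycleSketch`, `FilesStub`, and `JobStub` only mirror the shape of the documented methods, and the guard in `runHarvest()` is an assumption made for illustration, not documented behavior of HarvestJob.

```java
import java.util.ArrayList;
import java.util.List;

public class LifecycleSketch {

    /** Hypothetical stand-in for Heritrix3Files: only remembers a crawl dir name. */
    public static class FilesStub {
        public final String crawlDir;

        public FilesStub(String crawlDir) {
            this.crawlDir = crawlDir;
        }
    }

    /** Hypothetical stand-in mirroring init / getHeritrix3Files / runHarvest. */
    public static class JobStub {
        private FilesStub files;
        public final List<String> log = new ArrayList<>();

        /** Prepares the files object, as init() is described as doing. */
        public void init(String jobName) {
            files = new FilesStub("crawldir-" + jobName);
            log.add("init");
        }

        /** Returns the object prepared by init(), or null before init(). */
        public FilesStub getFiles() {
            return files;
        }

        public void runHarvest() {
            // Assumed guard: refuse to run before init() has prepared the files.
            if (files == null) {
                throw new IllegalStateException("init() must be called first");
            }
            log.add("run:" + files.crawlDir);
        }
    }

    public static void main(String[] args) {
        JobStub job = new JobStub();
        job.init("42");      // step 1: set up the job and its files
        job.runHarvest();    // step 2: run once setup is complete
        System.out.println(job.log);
    }
}
```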
-
createCrawlDir
public File createCrawlDir()
Create the crawl dir, but make sure a message is sent if there is a problem.
Returns:
The directory that the crawl will take place in.
Throws:
PermissionDenied - if the directory cannot be created.
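The contract of createCrawlDir (create the directory, and make sure a message is sent if that fails) can be sketched in isolation as follows. This is a minimal illustration, not the NetarchiveSuite implementation: the `CrawlDirSketch` class, the `Notifier` interface, and the local `PermissionDenied` class are hypothetical stand-ins introduced only for this example.

```java
import java.io.File;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

public class CrawlDirSketch {

    /** Hypothetical stand-in for whatever channel reports problems. */
    public interface Notifier {
        void notifyProblem(String message);
    }

    /** Local stand-in for the PermissionDenied exception documented for createCrawlDir. */
    public static class PermissionDenied extends RuntimeException {
        public PermissionDenied(String message) {
            super(message);
        }
    }

    /**
     * Creates the crawl directory under baseDir. If creation fails, a
     * message is sent first and then PermissionDenied is thrown.
     */
    public static File createCrawlDir(File baseDir, String name, Notifier notifier) {
        File crawlDir = new File(baseDir, name);
        // mkdirs() returns false both on failure and when the directory
        // already exists, so check isDirectory() as well.
        if (!crawlDir.mkdirs() && !crawlDir.isDirectory()) {
            String msg = "Unable to create crawl directory '" + crawlDir + "'";
            notifier.notifyProblem(msg);
            throw new PermissionDenied(msg);
        }
        return crawlDir;
    }

    public static void main(String[] args) throws Exception {
        File base = Files.createTempDirectory("crawl-sketch").toFile();
        List<String> messages = new ArrayList<>();
        File dir = createCrawlDir(base, "job-1", messages::add);
        System.out.println(dir + " created, messages sent: " + messages.size());
    }
}
```

Sending the message before throwing means the problem is reported even if a caller further up swallows the exception.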
-
writeHarvestFiles
public Heritrix3Files writeHarvestFiles(File crawldir, Job job, HarvestDefinitionInfo hdi, List<MetadataEntry> metadataEntries)
Writes the files needed to start a harvest.
Parameters:
crawldir - The directory that the crawl should take place in.
job - The Job object containing various harvest setup data.
hdi - The object encapsulating documentary information about the harvest.
metadataEntries - Any metadata entries sent along with the job that should be stored for later use.
Returns:
An object encapsulating where these files have been written.
-
-