Class PostProcessing
- java.lang.Object
  - dk.netarkivet.harvester.heritrix3.PostProcessing
public class PostProcessing extends Object
Method Summary
Modifier and Type, Method: Description
void cleanup()
    Clean up this singleton, releasing the ArcRepositoryClient and removing the instance.
void doPostProcessing(File crawlDir, Throwable crawlException)
    Do postprocessing of data in a crawldir.
static PostProcessing getInstance(JMSConnection jmsConnection)
    Get the instance of the singleton HarvestController.
void processOldJobs()
    Looks for old job directories that await uploading of data.
Method Detail
-
getInstance
public static PostProcessing getInstance(JMSConnection jmsConnection)
Get the instance of the singleton HarvestController.
- Returns:
- The singleton instance.
-
cleanup
public void cleanup()
Clean up this singleton, releasing the ArcRepositoryClient and removing the instance. This instance should not be used after this method has been called. After this has been called, new calls to getInstance will return a new instance.
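The getInstance/cleanup lifecycle described above can be illustrated with a minimal, self-contained sketch. It is not the NetarchiveSuite implementation: `PostProcessingSketch`, `Connection`, and `isClosed()` are hypothetical stand-ins introduced here, and the real class takes a `JMSConnection` and also releases an ArcRepositoryClient.

```java
// Hypothetical stand-in for the connection type; not the real JMSConnection.
interface Connection {}

// Sketch of a singleton whose cleanup() removes the shared instance,
// so that a later getInstance() call creates a fresh one.
final class PostProcessingSketch {
    private static PostProcessingSketch instance;

    private final Connection connection;
    private volatile boolean closed;

    private PostProcessingSketch(Connection connection) {
        this.connection = connection;
    }

    // Returns the shared instance, creating it on first use.
    static synchronized PostProcessingSketch getInstance(Connection connection) {
        if (instance == null) {
            instance = new PostProcessingSketch(connection);
        }
        return instance;
    }

    // Marks this object as unusable and clears the shared reference.
    void cleanup() {
        synchronized (PostProcessingSketch.class) {
            closed = true;
            if (instance == this) {
                instance = null;
            }
        }
    }

    // Hypothetical helper so callers can see that cleanup() has run.
    boolean isClosed() {
        return closed;
    }
}
```

In this sketch, a reference obtained before `cleanup()` stays marked as closed, while the next `getInstance` call returns a different object. Callers should therefore not cache the instance across a cleanup.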
-
processOldJobs
public void processOldJobs()
Looks for old job directories that await uploading of data. The existence of the harvestInfo.xml in the
-
doPostProcessing
public void doPostProcessing(File crawlDir, Throwable crawlException) throws IOFailure
Do postprocessing of data in a crawldir.
1. Retrieves jobID and crawlDir from the harvestInfoFile using class PersistentJobData.
2. Finds JobId and arcsdir.
3. Calls storeArcFiles.
4. Moves harvestdir to oldjobs and deletes crawl.log and other superfluous files.
- Parameters:
crawlDir - The location of harvest-info to be processed
crawlException - any exceptions thrown by the crawl which need to be reported back to the scheduler (may be null for success)
- Throws:
IOFailure - if the harvestInfo.xml file cannot be read
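As a rough illustration of the flow above, the following self-contained sketch covers only the readable-file check and step 4, using plain `java.nio.file` calls. Everything in it is an assumption made for illustration: `PostProcessingStepsSketch`, `Result`, the `oldJobsDir` parameter, and the use of `UncheckedIOException` in place of `IOFailure` are not part of the documented API, and the parsing and upload steps (1 to 3) are left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; not the NetarchiveSuite implementation.
final class PostProcessingStepsSketch {

    // Hypothetical result holder: where the crawl dir ended up and
    // whether the crawl itself reported an exception.
    static final class Result {
        final Path movedTo;
        final boolean crawlFailed;

        Result(Path movedTo, boolean crawlFailed) {
            this.movedTo = movedTo;
            this.crawlFailed = crawlFailed;
        }
    }

    static Result postProcess(Path crawlDir, Path oldJobsDir, Throwable crawlException) {
        Path harvestInfo = crawlDir.resolve("harvestInfo.xml");
        if (!Files.isReadable(harvestInfo)) {
            // Stand-in for IOFailure, which this sketch does not depend on.
            throw new UncheckedIOException(new IOException("Cannot read " + harvestInfo));
        }
        try {
            // Steps 1-3 (reading job data, storing archive files) are omitted here.
            // Step 4: drop crawl.log, then move the crawl dir under oldJobsDir.
            Files.deleteIfExists(crawlDir.resolve("crawl.log"));
            Files.createDirectories(oldJobsDir);
            Path target = oldJobsDir.resolve(crawlDir.getFileName());
            Files.move(crawlDir, target);
            // A null crawlException means the crawl succeeded.
            return new Result(target, crawlException != null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The sketch assumes `crawlDir` and `oldJobsDir` are on the same file system, since `Files.move` may refuse to move a non-empty directory across file stores.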