Class PostProcessing


  • public class PostProcessing
    extends Object
    • Method Detail

      • getInstance

        public static PostProcessing getInstance​(JMSConnection jmsConnection)
        Get the singleton instance of PostProcessing.
        Returns:
        The singleton instance.
      • cleanup

        public void cleanup()
        Clean up this singleton, releasing the ArcRepositoryClient and removing the instance. This instance should not be used after this method has been called. After this has been called, new calls to getInstance will return a new instance.
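        The getInstance/cleanup contract described above can be illustrated with a minimal sketch. This is not the NetarchiveSuite source: the class SingletonLifecycleSketch and its isClosed() method are hypothetical, the JMSConnection argument is omitted, and a boolean flag stands in for releasing the ArcRepositoryClient.

```java
/** Hypothetical stand-in for a singleton with a cleanup() method; not the real PostProcessing class. */
class SingletonLifecycleSketch {
    /** The current instance, or null if none exists or it has been cleaned up. */
    private static SingletonLifecycleSketch instance;

    /** Stands in for held resources (assumption: the real class releases an ArcRepositoryClient here). */
    private boolean closed = false;

    private SingletonLifecycleSketch() {
    }

    /** Returns the current instance, creating one if necessary. */
    static synchronized SingletonLifecycleSketch getInstance() {
        if (instance == null) {
            instance = new SingletonLifecycleSketch();
        }
        return instance;
    }

    /** Releases resources and forgets the instance, so the next getInstance() creates a new one. */
    void cleanup() {
        synchronized (SingletonLifecycleSketch.class) {
            closed = true;
            if (instance == this) {
                instance = null;
            }
        }
    }

    /** True once cleanup() has been called on this object. */
    boolean isClosed() {
        return closed;
    }

    public static void main(String[] args) {
        SingletonLifecycleSketch first = getInstance();
        System.out.println(first == getInstance());   // same object while the instance is live
        first.cleanup();
        SingletonLifecycleSketch second = getInstance();
        System.out.println(first == second);          // a fresh object after cleanup
    }
}
```

        Callers that keep a reference to the old object after cleanup() would still see it as closed, which is why the documentation says the instance should not be used afterwards.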
      • processOldJobs

        public void processOldJobs()
        Looks for old job directories that await uploading of data. The existence of the harvestInfo.xml in the
      • doPostProcessing

        public void doPostProcessing​(File crawlDir,
                                     Throwable crawlException)
                              throws IOFailure
        Post-processes the data in a crawl directory:
        1. Retrieves the jobID and crawlDir from the harvestInfoFile using the class PersistentJobData.
        2. Finds the JobId and arcsdir.
        3. Calls storeArcFiles.
        4. Moves the harvestdir to oldjobs and deletes crawl.log and other superfluous files.
        Parameters:
        crawlDir - The location of harvest-info to be processed
        crawlException - any exception thrown by the crawl that needs to be reported back to the scheduler (may be null on success)
        Throws:
        IOFailure - if the harvestInfo.xml file cannot be read
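        The steps listed for doPostProcessing can be sketched roughly as follows, using only java.nio.file. Everything here is an assumption made for illustration: the class PostProcessingSketch, the postProcess method, the local IOFailure stand-in (assumed unchecked), the "arcs" subdirectory name, and the oldJobsDir parameter are hypothetical, and the upload step only records file names instead of calling storeArcFiles.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical illustration of the documented steps; not the real PostProcessing class. */
class PostProcessingSketch {

    /** Local stand-in for the IOFailure named in the documentation (assumed to be unchecked). */
    static class IOFailure extends RuntimeException {
        IOFailure(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Returns the names of the files that would have been uploaded. */
    static List<String> postProcess(Path crawlDir, Path oldJobsDir, Throwable crawlException) {
        // Steps 1-2: the job metadata lives in harvestInfo.xml; fail if it cannot be read.
        Path harvestInfo = crawlDir.resolve("harvestInfo.xml");
        if (!Files.isReadable(harvestInfo)) {
            throw new IOFailure("Cannot read " + harvestInfo, null);
        }
        List<String> uploaded = new ArrayList<>();
        try {
            // Step 3: stand-in for storeArcFiles; only collects the file names.
            Path arcsDir = crawlDir.resolve("arcs");
            if (Files.isDirectory(arcsDir)) {
                try (Stream<Path> files = Files.list(arcsDir)) {
                    files.sorted().forEach(p -> uploaded.add(p.getFileName().toString()));
                }
            }
            // Step 4: delete crawl.log, then move the crawl directory under oldjobs.
            Files.deleteIfExists(crawlDir.resolve("crawl.log"));
            Files.createDirectories(oldJobsDir);
            Files.move(crawlDir, oldJobsDir.resolve(crawlDir.getFileName()));
        } catch (IOException e) {
            throw new IOFailure("Post-processing failed for " + crawlDir, e);
        }
        if (crawlException != null) {
            // Stand-in for reporting the crawl error back to the scheduler.
            System.err.println("Crawl ended with error: " + crawlException);
        }
        return uploaded;
    }

    public static void main(String[] args) throws IOException {
        Path work = Files.createTempDirectory("pp-sketch");
        Path crawlDir = Files.createDirectories(work.resolve("job_42"));
        Files.writeString(crawlDir.resolve("harvestInfo.xml"), "<harvestInfo/>");
        Files.writeString(crawlDir.resolve("crawl.log"), "log");
        Files.createDirectories(crawlDir.resolve("arcs"));
        Files.writeString(crawlDir.resolve("arcs").resolve("a.arc"), "data");
        System.out.println(postProcess(crawlDir, work.resolve("oldjobs"), null));
    }
}
```

        Checking harvestInfo.xml before touching anything else mirrors the documented Throws clause: an unreadable file aborts the run before any file is moved or deleted.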