Class PersistentJobData

  • All Implemented Interfaces:
    JobInfo

    public class PersistentJobData
    extends Object
    implements JobInfo
    Holds information about an ongoing harvest. The information is currently stored in an XML file.
    • Constructor Detail

      • PersistentJobData

        public PersistentJobData(File crawlDir)
        Constructor for class PersistentJobData.
        Parameters:
        crawlDir - The directory where the harvestInfo can be found
        Throws:
        ArgumentNotValid - if crawlDir is null or does not exist.
    • Method Detail

      • exists

        public boolean exists()
        Returns true if harvestInfo exists in crawlDir, otherwise false.
        Returns:
        true, if harvestInfo exists, otherwise false
      • existsIn

        public static boolean existsIn(File crawlDir)
        Returns true if the given directory exists and contains a harvestInfo file.
        Parameters:
        crawlDir - A directory that may contain a harvestInfo file.
        Returns:
        True if the harvestInfo file exists.
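The check described for existsIn and getHarvestInfoFile can be sketched in isolation as follows. This is a minimal stand-in, not the library's implementation: the class name HarvestInfoLocator and the file name "harvestInfo.xml" are assumptions made for illustration, and returning false for a null directory is a choice of this sketch only.

```java
import java.io.File;

/** Hypothetical stand-in for illustration; not the PersistentJobData class. */
class HarvestInfoLocator {
    /** Assumed file name; the documentation above does not state it. */
    static final String HARVEST_INFO_FILENAME = "harvestInfo.xml";

    /** Resolves the harvestInfo file inside crawlDir without touching the disk. */
    static File getHarvestInfoFile(File crawlDir) {
        return new File(crawlDir, HARVEST_INFO_FILENAME);
    }

    /**
     * True only if crawlDir is an existing directory that holds the file.
     * Returning false for null is this sketch's choice, not documented behavior.
     */
    static boolean existsIn(File crawlDir) {
        return crawlDir != null
                && crawlDir.isDirectory()
                && getHarvestInfoFile(crawlDir).isFile();
    }
}
```

A caller would typically test existsIn(crawlDir) first and only then read the job data, since the getters documented below throw IOFailure when no harvestInfo exists.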
      • getHarvestInfoFile

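        public static File getHarvestInfoFile(File crawlDir)
        Returns the location of the harvestInfo file in the given crawlDir.
        Parameters:
        crawlDir - The directory that may contain the harvestInfo file.
        Returns:
        the location of the harvestInfo file in the crawlDir.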
      • write

        public void write(Job harvestJob,
                          HarvestDefinitionInfo hdi)
        Writes information about the given Job to an XML structure.
        Parameters:
        harvestJob - the given Job
        hdi - Information about the harvestJob.
        Throws:
        IOFailure - if any failure occurs while persisting data, or if the file has already been written.
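The "already written" failure mode of write can be illustrated with a write-once helper. This is a sketch under stated assumptions, not how the library does it: the class WriteOnce is hypothetical, the XML content is passed in as an opaque string, and java.io.UncheckedIOException stands in for IOFailure.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper for illustration; not part of the documented API. */
class WriteOnce {
    /**
     * Writes xml to target, failing if target already exists.
     * CREATE_NEW makes the existence check and the creation one atomic step.
     */
    static void writeOnce(Path target, String xml) {
        try {
            Files.write(target, xml.getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
        } catch (IOException e) {
            // Stand-in for IOFailure: covers both I/O errors and an existing file.
            throw new UncheckedIOException("Could not persist " + target, e);
        }
    }
}
```

With this approach a second call for the same target fails instead of silently overwriting the first result, which matches the contract stated for write.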
      • getJobID

        public Long getJobID()
        Return the harvestInfo jobID.
        Specified by:
        getJobID in interface JobInfo
        Returns:
        the harvestInfo jobID
        Throws:
        IOFailure - if no harvestInfo exists or it is invalid.
      • getChannel

        public String getChannel()
        Return the job's harvest channel name.
        Returns:
        the job's harvest channel name
        Throws:
        IOFailure - if no harvestInfo exists or it is invalid.
      • getJobHarvestNum

        public int getJobHarvestNum()
        Return the job harvestNum.
        Returns:
        the job harvestNum
        Throws:
        IOFailure - if no harvestInfo exists or it is invalid.
      • getOrigHarvestDefinitionID

        public Long getOrigHarvestDefinitionID()
        Return the job origHarvestDefinitionID.
        Specified by:
        getOrigHarvestDefinitionID in interface JobInfo
        Returns:
        the job origHarvestDefinitionID
        Throws:
        IOFailure - if no harvestInfo exists or it is invalid.
      • getMaxBytesPerDomain

        public long getMaxBytesPerDomain()
        Return the job maxBytesPerDomain value.
        Returns:
        the job maxBytesPerDomain value.
        Throws:
        IOFailure - if no harvestInfo exists or it is invalid.
      • getMaxObjectsPerDomain

        public long getMaxObjectsPerDomain()
        Return the job maxObjectsPerDomain value.
        Returns:
        the job maxObjectsPerDomain value.
        Throws:
        IOFailure - if no harvestInfo exists or it is invalid.
      • getOrderXMLName

        public String getOrderXMLName()
        Return the job orderXMLName.
        Returns:
        the job orderXMLName.
        Throws:
        IOFailure - if no harvestInfo exists or it is invalid.
      • getVersion

        public String getVersion()
        Return the version of the XML.
        Returns:
        the version of the XML
        Throws:
        IOFailure - if no harvestInfo exists or it is invalid.
      • getHarvestFilenamePrefix

        public String getHarvestFilenamePrefix()
        Returns the harvest filename prefix. If no prefix is set in the persistent job data, falls back to the standard form, jobid-harvestid.
        Specified by:
        getHarvestFilenamePrefix in interface JobInfo
        Returns:
        the harvestFilename prefix.
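The fallback described for getHarvestFilenamePrefix can be sketched as a small pure function. The class FilenamePrefix and its parameters are hypothetical; treating an empty string as "not set" is an assumption of this sketch, and the documentation does not confirm which identifier "harvestid" refers to (presumably the value returned by getOrigHarvestDefinitionID()).

```java
/** Hypothetical helper for illustration; not part of the documented API. */
class FilenamePrefix {
    /**
     * Returns the stored prefix when one is set, otherwise the
     * default form "jobid-harvestid".
     */
    static String resolve(String storedPrefix, long jobId, long harvestId) {
        // Assumption: null and empty both count as "not set".
        if (storedPrefix != null && !storedPrefix.isEmpty()) {
            return storedPrefix;
        }
        return jobId + "-" + harvestId;
    }
}
```

For example, resolve(null, 42L, 7L) yields "42-7", while resolve("custom", 42L, 7L) returns "custom" unchanged.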
      • getharvestName

        public String getharvestName()
        Return the harvest name in this XML.
        Returns:
        the harvest name in this XML.
        Throws:
        IOFailure - if no harvestInfo exists or it is invalid.
      • getScheduleName

        public String getScheduleName()
        Return the schedule name in this XML.
        Returns:
        the schedule name in this XML (or null, if undefined for this job)
        Throws:
        IOFailure - if no harvestInfo exists or it is invalid.
      • getJobSubmitDate

        public String getJobSubmitDate()
        Return the submit date of the job in this XML.
        Returns:
        the submit date of the job in this XML.
        Throws:
        IOFailure - if no harvestInfo exists or it is invalid.
      • getPerformer

        public String getPerformer()
        Return the performer information in this XML.
        Returns:
        the performer information in this XML, or null if the value is undefined
        Throws:
        IOFailure - if no harvestInfo exists or it is invalid.
      • getAudience

        public String getAudience()
        Return the audience information in this XML.
        Returns:
        the audience information in this XML, or null if the value is undefined
        Throws:
        IOFailure - if no harvestInfo exists or it is invalid.