Class FileBatchJob

    • Field Detail

      • noOfFilesProcessed

        protected int noOfFilesProcessed
        The total number of files processed (including any that generated errors).
      • batchJobTimeout

        protected long batchJobTimeout
        If positive, this is the timeout for this specific batch job in milliseconds. If negative, the standard timeout from settings is used.
      • filesFailed

        protected Set<File> filesFailed
        A Set of files which generated errors.
    • Constructor Detail

      • FileBatchJob

        public FileBatchJob()
    • Method Detail

      • initialize

        public abstract void initialize(OutputStream os)
        Initialize the job before running. This is called before the processFile() calls. If this throws an exception, processFile() will not be called, but finish() will.
        Parameters:
        os - the OutputStream to which output should be written
      • processFile

        public abstract boolean processFile(File file,
                                            OutputStream os)
        Process one file stored in the bit archive.
        Parameters:
        file - the file to be processed.
        os - the OutputStream to which output should be written
        Returns:
        true if the file was successfully processed, false otherwise
      • finish

        public abstract void finish(OutputStream os)
        Finish up the job. This is called after the last processFile() call. If the initialize() call throws an exception, this will still be called so that any resources allocated can be cleaned up. Implementations should make sure that this method can handle a partial initialization.
        Parameters:
        os - the OutputStream to which output should be written
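The call order described for initialize(), processFile() and finish() can be illustrated with a minimal sketch. SketchBatchJob is a hypothetical stand-in that mirrors only the three abstract signatures above, and LifecycleSketch.run is an assumed driver, not the real batch runner; the point is only that finish() runs even when initialize() throws and must cope with partial initialization.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.OutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in mirroring only the three abstract signatures documented above.
abstract class SketchBatchJob {
    abstract void initialize(OutputStream os);
    abstract boolean processFile(File file, OutputStream os);
    abstract void finish(OutputStream os);
}

// Example job: writes a header, one line per file name, and a footer with the count.
class FileNameListingJob extends SketchBatchJob {
    private PrintStream out; // stays null if initialize() fails before assigning it
    private int count;

    @Override
    void initialize(OutputStream os) {
        out = new PrintStream(os, true);
        out.print("BEGIN\n");
    }

    @Override
    boolean processFile(File file, OutputStream os) {
        out.print(file.getName() + "\n");
        count++;
        return true;
    }

    @Override
    void finish(OutputStream os) {
        // Tolerate partial initialization: fall back to the stream passed in.
        PrintStream target = (out != null) ? out : new PrintStream(os, true);
        target.print("END " + count + "\n");
        target.flush();
    }
}

class LifecycleSketch {
    // Assumed driver: skips processFile() if initialize() throws, always calls finish().
    static String run(SketchBatchJob job, List<File> files) {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        boolean initialized = false;
        try {
            job.initialize(os);
            initialized = true;
        } catch (RuntimeException e) {
            // A real runner would record this; the sketch only skips processing.
        }
        if (initialized) {
            for (File f : files) {
                job.processFile(f, os);
            }
        }
        job.finish(os);
        return new String(os.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<File> files = Arrays.asList(new File("a.arc"), new File("b.arc"));
        System.out.print(run(new FileNameListingJob(), files));
    }
}
```

Running main prints BEGIN, the two file names, and END 2, each on its own line. If initialize() throws, only the footer line END 0 is written.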
      • processOnlyFilesNamed

        public void processOnlyFilesNamed(List<String> specifiedFilenames)
        Mark the job to process only the specified files. This will override any previous setting of which files to process.
        Parameters:
        specifiedFilenames - A list of filenames to process (without paths). If null, all files will be processed.
      • processOnlyFileNamed

        public void processOnlyFileNamed(String specifiedFilename)
        Helper method for only processing one file. This will override any previous setting of which files to process.
        Parameters:
        specifiedFilename - The name of the single file that should be processed. Should not include any path information.
      • processOnlyFilesMatching

        public void processOnlyFilesMatching(List<String> specifiedPatterns)
        Set this job to match only a certain set of patterns. This will override any previous setting of which files to process.
        Parameters:
        specifiedPatterns - The patterns of file names that this job will operate on. These should not include any path information, but should match the entire filename (e.g. .*foo.* for any file with foo in the name).
      • processOnlyFilesMatching

        public void processOnlyFilesMatching(String specifiedPattern)
        Set this job to match only a certain pattern. This will override any previous setting of which files to process.
        Parameters:
        specifiedPattern - Regular expression of file names that this job will operate on. This should not include any path information, but should match the entire filename (e.g. .*foo.* for any file with foo in the name).
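The requirement that a pattern match the entire filename can be checked with plain java.util.regex. The sketch below is only an illustration of whole-name matching; FilenamePatternSketch and select are hypothetical names, and how FileBatchJob combines several patterns internally is not stated here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class FilenamePatternSketch {
    // Keep only the names that the regex matches in full (Matcher.matches semantics).
    static List<String> select(String regex, List<String> names) {
        Pattern pattern = Pattern.compile(regex);
        return names.stream()
                .filter(name -> pattern.matcher(name).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("1-foo-2.arc", "bar.arc");
        // ".*foo.*" covers the whole name of the first file.
        System.out.println(select(".*foo.*", names));
        // "foo" alone matches no complete filename, so nothing is selected.
        System.out.println(select("foo", names));
    }
}
```

This is why the documented example uses .*foo.* rather than foo: a bare substring pattern would select no files under whole-name matching.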
      • getFilenamePattern

        public Pattern getFilenamePattern()
        Get the pattern for files that should be processed.
        Returns:
        A pattern for files to process.
      • getNoOfFilesProcessed

        public int getNoOfFilesProcessed()
        Return the number of files processed in this job.
        Returns:
        the number of files processed in this job
      • getFilesFailed

        public Collection<File> getFilesFailed()
        Return the files for which processing failed. An empty collection is returned if none failed.
        Returns:
        the possibly empty collection of files for which processing failed
      • getExceptions

        public List<FileBatchJob.ExceptionOccurrence> getExceptions()
        Get the list of exceptions that have occurred during processing.
        Returns:
        List of exceptions together with information on where they happened.
      • postProcess

        public boolean postProcess(InputStream input,
                                   OutputStream output)
        Processes the concatenated result files. This is intended to be overridden by batch jobs that want post-processing other than plain concatenation.
        Parameters:
        input - The inputstream to the file containing the concatenated results.
        output - The outputstream where the resulting data should be written.
        Returns:
        Whether any post-processing was actually performed. If false is returned, the default concatenated result file is returned.
        Throws:
        ArgumentNotValid - If the concatenated file is null.
      • addException

        protected void addException(File currentFile,
                                    long currentOffset,
                                    long outputOffset,
                                    Exception e)
        Record an exception that occurred during the processFile() call of this job and that should be returned with the result. If maxExceptionsReached() returns true, this method silently does nothing.
        Parameters:
        currentFile - The file that is currently being processed.
        currentOffset - The relevant offset into the file when the exception happened (e.g. the start of an ARC record).
        outputOffset - The offset we were at in the outputstream when the exception happened. If UNKNOWN_OFFSET, the offset could not be found.
        e - The exception thrown. This exception must be serializable.
      • addInitializeException

        protected void addInitializeException(long outputOffset,
                                              Exception e)
        Record an exception that occurred during the initialize() method of this job.
        Parameters:
        outputOffset - The offset we were at in the outputstream when the exception happened. If UNKNOWN_OFFSET, the offset could not be found.
        e - The exception thrown. This exception must be serializable.
      • addFinishException

        protected void addFinishException(long outputOffset,
                                          Exception e)
        Record an exception that occurred during the finish() method of this job.
        Parameters:
        outputOffset - The offset we were at in the outputstream when the exception happened. If UNKNOWN_OFFSET, the offset could not be found.
        e - The exception thrown. This exception must be serializable.
      • getBatchJobTimeout

        public long getBatchJobTimeout()
        Getter for batchJobTimeout. If the batch job has not defined a maximum time (i.e. the value is set to -1), the default value from settings is used.
        Returns:
        timeout in milliseconds.
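The fallback rule for the timeout can be written as a one-line helper. This is a sketch under assumptions: TimeoutSketch and effectiveTimeout are hypothetical names, the default is passed in rather than read from settings, and treating zero like a negative value is a guess, since the text only covers positive and negative values.

```java
class TimeoutSketch {
    // Positive values are job-specific timeouts in milliseconds; anything else
    // falls back to the default (zero handling is an assumption of this sketch).
    static long effectiveTimeout(long batchJobTimeout, long defaultTimeout) {
        return batchJobTimeout > 0 ? batchJobTimeout : defaultTimeout;
    }

    public static void main(String[] args) {
        long defaultTimeout = 60_000L; // hypothetical default, not taken from real settings
        System.out.println(effectiveTimeout(5_000L, defaultTimeout));
        System.out.println(effectiveTimeout(-1L, defaultTimeout));
    }
}
```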
      • maxExceptionsReached

        protected boolean maxExceptionsReached()
        Returns true if we have already recorded the maximum number of exceptions. At this point, no more exceptions will be recorded, and processing should be aborted.
        Returns:
        True if the maximum number of exceptions (MAX_EXCEPTIONS) has been recorded already.
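The interplay between addException() and maxExceptionsReached() amounts to a capped log. The following is a minimal sketch, not the class's actual implementation: CappedExceptionLog is a hypothetical name, the cap of 3 is chosen for illustration only (the real MAX_EXCEPTIONS value is not given in this section), and file and offset bookkeeping is omitted.

```java
import java.util.ArrayList;
import java.util.List;

class CappedExceptionLog {
    // Hypothetical cap for the sketch; the real MAX_EXCEPTIONS value is not stated here.
    static final int MAX_EXCEPTIONS = 3;

    private final List<Exception> exceptions = new ArrayList<>();

    // True once the cap has been hit; callers should then abort processing.
    boolean maxExceptionsReached() {
        return exceptions.size() >= MAX_EXCEPTIONS;
    }

    // Records the exception, or silently does nothing once the cap is reached.
    void addException(Exception e) {
        if (maxExceptionsReached()) {
            return;
        }
        exceptions.add(e);
    }

    int size() {
        return exceptions.size();
    }

    public static void main(String[] args) {
        CappedExceptionLog log = new CappedExceptionLog();
        for (int i = 0; i < 5; i++) {
            log.addException(new RuntimeException("failure " + i));
        }
        System.out.println(log.size() + " recorded, cap reached: " + log.maxExceptionsReached());
    }
}
```

A processing loop would typically check maxExceptionsReached() after each recorded failure and stop early, since further exceptions are dropped anyway.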
      • setBatchJobTimeout

        public void setBatchJobTimeout(long batchJobTimeout)
        Override the predefined timeout period for this batch job.
        Parameters:
        batchJobTimeout - the timeout period in milliseconds