dk.netarkivet.common.utils.arc
Class FileBatchJob

java.lang.Object
  extended by dk.netarkivet.common.utils.arc.FileBatchJob
All Implemented Interfaces:
java.io.Serializable
Direct Known Subclasses:
ARCBatchJob, ChecksumJob, FileListJob

public abstract class FileBatchJob
extends java.lang.Object
implements java.io.Serializable

Abstract base class defining a batch job to run on a set of files. The job is initialized by calling initialize(), executed on each file by calling processFile(), and any cleanup is handled by finish().
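The lifecycle described above can be sketched as follows. This is a minimal sketch, not NetarchiveSuite code: BatchJobSketch is a hypothetical stand-in that mirrors only the three abstract method signatures documented on this page, and NameListingJob and the runJob driver are invented for illustration. How the real bit archive application drives a job is not specified here.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

/** Demo driver: initialize once, processFile per file, finish once. */
class LifecycleDemo {
    /** Hypothetical driver; collects everything the job writes into a String. */
    static String runJob(BatchJobSketch job, List<File> files) {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        job.initialize(os);
        for (File f : files) {
            job.processFile(f, os);
        }
        job.finish(os);
        return new String(os.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<File> files = Arrays.asList(new File("a.arc"), new File("b.arc"));
        System.out.print(runJob(new NameListingJob(), files));
    }
}

/** Stand-in (assumption) mirroring the abstract methods documented for FileBatchJob. */
abstract class BatchJobSketch {
    public abstract void initialize(OutputStream os);
    public abstract boolean processFile(File file, OutputStream os);
    public abstract void finish(OutputStream os);
}

/** Hypothetical job that writes each file name on its own line, then a count. */
class NameListingJob extends BatchJobSketch {
    private int count;

    @Override
    public void initialize(OutputStream os) {
        count = 0; // reset state before any processFile() call
    }

    @Override
    public boolean processFile(File file, OutputStream os) {
        try {
            os.write((file.getName() + "\n").getBytes(StandardCharsets.UTF_8));
            count++;
            return true;
        } catch (IOException e) {
            return false; // report failure through the return value
        }
    }

    @Override
    public void finish(OutputStream os) {
        try {
            os.write(("files: " + count + "\n").getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            // Nothing more can be written; a real job might log this.
        }
    }
}
```

All three methods write to the same OutputStream, so a job can emit a header in initialize(), per-file results in processFile(), and a summary in finish().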

See Also:
Serialized Form

Field Summary
protected  java.util.Set<java.io.File> filesFailed
           
protected  int noOfFilesProcessed
           
 
Constructor Summary
FileBatchJob()
           
 
Method Summary
abstract  void finish(java.io.OutputStream os)
          Finish up the job.
 java.util.regex.Pattern getFilenamePattern()
          Get the pattern for files that should be processed.
 java.util.Collection<java.io.File> getFilesFailed()
          Returns the ARC files for which processing (of one or more ARC records) failed, or an empty collection if none failed.
 int getNoOfFilesProcessed()
          Returns the number of ARC-files processed in this job (at this bit archive application).
abstract  void initialize(java.io.OutputStream os)
          Initialize the job before running.
abstract  boolean processFile(java.io.File file, java.io.OutputStream os)
          Process one file stored in the bit archive.
 void processOnlyFileNamed(java.lang.String specifiedFilename)
          Helper method for only processing one file.
 void processOnlyFilesMatching(java.util.List<java.lang.String> specifiedPatterns)
          Set this job to match only a certain set of patterns.
 void processOnlyFilesMatching(java.lang.String specifiedPattern)
          Set this job to match only a certain pattern.
 void processOnlyFilesNamed(java.util.List<java.lang.String> specifiedFilenames)
          Mark the job to process only the specified files.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

noOfFilesProcessed

protected int noOfFilesProcessed

filesFailed

protected java.util.Set<java.io.File> filesFailed
Constructor Detail

FileBatchJob

public FileBatchJob()
Method Detail

initialize

public abstract void initialize(java.io.OutputStream os)
Initialize the job before running. This is called before any processFile() calls.

Parameters:
os - the OutputStream to which output should be written

processFile

public abstract boolean processFile(java.io.File file,
                                    java.io.OutputStream os)
Process one file stored in the bit archive.

Parameters:
file - the file to be processed.
os - the OutputStream to which output should be written
Returns:
true if the file was successfully processed, false otherwise

finish

public abstract void finish(java.io.OutputStream os)
Finish up the job. This is called after the last processFile() call.

Parameters:
os - the OutputStream to which output should be written

processOnlyFilesNamed

public void processOnlyFilesNamed(java.util.List<java.lang.String> specifiedFilenames)
Mark the job to process only the specified files. This will override any previous setting of which files to process.

Parameters:
specifiedFilenames - A list of filenames to process (without paths). If null, all files will be processed.
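Since the job exposes its file selection as a java.util.regex.Pattern (see getFilenamePattern()), a list of literal filenames presumably has to be turned into a regular expression at some point. The following is a minimal sketch of one way to do that, assuming each name is escaped with Pattern.quote and the results are joined by alternation. FilenameListSketch and patternForNames are hypothetical names, and the actual implementation may differ.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class FilenameListSketch {
    /** Hypothetical helper: builds a pattern matching exactly the given names. */
    static Pattern patternForNames(List<String> names) {
        if (names == null) {
            // Documented behaviour: null means all files are processed.
            return Pattern.compile(".*");
        }
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < names.size(); i++) {
            if (i > 0) {
                sb.append('|');
            }
            // Pattern.quote makes '.' and other metacharacters literal.
            sb.append(Pattern.quote(names.get(i)));
        }
        sb.append(')');
        return Pattern.compile(sb.toString());
    }

    public static void main(String[] args) {
        Pattern p = patternForNames(Arrays.asList("1-1.arc", "2-1.arc"));
        System.out.println(p.matcher("1-1.arc").matches()); // true
        System.out.println(p.matcher("1-1xarc").matches()); // false: '.' is literal
    }
}
```

Quoting matters because a typical name such as 1-1.arc contains a dot, which would otherwise match any character.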

processOnlyFileNamed

public void processOnlyFileNamed(java.lang.String specifiedFilename)
Helper method for only processing one file. This will override any previous setting of which files to process.

Parameters:
specifiedFilename - The name of the single file that should be processed. Should not include any path information.

processOnlyFilesMatching

public void processOnlyFilesMatching(java.util.List<java.lang.String> specifiedPatterns)
Set this job to match only a certain set of patterns. This will override any previous setting of which files to process.

Parameters:
specifiedPatterns - The patterns of file names that this job will operate on. These should not include any path information, but should match the entire filename (e.g. .*foo.* for any file with foo in the name).

processOnlyFilesMatching

public void processOnlyFilesMatching(java.lang.String specifiedPattern)
Set this job to match only a certain pattern. This will override any previous setting of which files to process.

Parameters:
specifiedPattern - Regular expression of file names that this job will operate on. This should not include any path information, but should match the entire filename (e.g. .*foo.* for any file with foo in the name).
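The requirement that a pattern match the entire filename corresponds to Matcher.matches() rather than Matcher.find() in java.util.regex. The sketch below illustrates the difference, assuming the job tests only the last path component (File.getName()) against the pattern. FullMatchSketch and selects are hypothetical names, not part of this class.

```java
import java.io.File;
import java.util.regex.Pattern;

class FullMatchSketch {
    /** Hypothetical check: strips the path, then requires a whole-name match. */
    static boolean selects(Pattern pattern, File file) {
        return pattern.matcher(file.getName()).matches();
    }

    public static void main(String[] args) {
        File f = new File("/archive/filedir/job-foo-1.arc");
        // true: the pattern covers the whole name
        System.out.println(selects(Pattern.compile(".*foo.*"), f));
        // false: "foo" alone is not the entire name
        System.out.println(selects(Pattern.compile("foo"), f));
        // false: the directory is not part of the name
        System.out.println(selects(Pattern.compile(".*filedir.*"), f));
    }
}
```

A pattern written for substring search, such as foo, therefore selects nothing unless it is wrapped as .*foo.*.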

getFilenamePattern

public java.util.regex.Pattern getFilenamePattern()
Get the pattern for files that should be processed.

Returns:
A pattern for files to process.

getNoOfFilesProcessed

public int getNoOfFilesProcessed()
Returns the number of ARC-files processed in this job (at this bit archive application).

Returns:
the number of ARC-files processed in this job

getFilesFailed

public java.util.Collection<java.io.File> getFilesFailed()
Returns the ARC files for which processing (of one or more ARC records) failed, or an empty collection if none failed.

Returns:
the possibly empty collection of ARC files for which processing (of one or more ARC records) failed
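After a job has run, the values from getNoOfFilesProcessed() and getFilesFailed() can be combined into a short report. This is a minimal sketch: summarize is a hypothetical helper that takes the two returned values directly rather than a real FileBatchJob, so it runs without NetarchiveSuite on the classpath. Whether failed files are included in the processed count is not stated on this page, so the sketch simply reports both numbers.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

class FailureReportSketch {
    /** Hypothetical helper: formats the processed count and the failed files. */
    static String summarize(int processed, Collection<File> failed) {
        StringBuilder sb = new StringBuilder();
        sb.append("processed=").append(processed)
          .append(", failed=").append(failed.size());
        for (File f : failed) {
            sb.append("\n  ").append(f.getName()); // one failed file per line
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(summarize(3, Collections.<File>emptyList()));
        System.out.println(summarize(3, Arrays.asList(new File("bad.arc"))));
    }
}
```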