dk.netarkivet.harvester.harvesting.metadata
Class MetadataFileWriter

java.lang.Object
  extended by dk.netarkivet.harvester.harvesting.metadata.MetadataFileWriter
Direct Known Subclasses:
MetadataFileWriterArc, MetadataFileWriterWarc

public abstract class MetadataFileWriter
extends java.lang.Object

Abstract base class for metadata file writers. Implementations must extend this class.


Field Summary
protected static int MDF_ARC
          Constant representing the ARC format.
protected static int MDF_WARC
          Constant representing the WARC format.
protected static int metadataFormat
          The metadata format in use.
 
Constructor Summary
MetadataFileWriter()
           
 
Method Summary
abstract  void close()
          Close the metadata file writer.
static MetadataFileWriter createWriter(java.io.File metadataArchiveFile)
          Create a writer that writes data to the given archive file.
abstract  java.io.File getFile()
           
static java.lang.String getMetadataArchiveFileName(java.lang.String jobID, java.lang.Long harvestID)
          Generates a name for an archive (ARC/WARC) file containing metadata regarding a given job.
protected static void initializeMetadataFormat()
          Initialize the used metadata format from settings.
 void insertFiles(java.io.File parentDir, java.io.FilenameFilter filter, java.lang.String mimetype, IngestableFiles files)
          Append the files contained in the directory to the metadata archive file, but only if the filename matches the supplied filter.
static void resetMetadataFormat()
          Reset the metadata format.
abstract  void write(java.lang.String uri, java.lang.String contentType, java.lang.String hostIP, long fetchBeginTimeStamp, byte[] payload)
          Write a record to the archive file.
abstract  void writeFileTo(java.io.File file, java.lang.String uri, java.lang.String mime)
          Write the given file to the metadata file.
abstract  boolean writeTo(java.io.File fileToArchive, java.lang.String URL, java.lang.String mimetype)
          Writes a File to an ARCWriter if one is available; otherwise logs the failure to the class logger.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

MDF_ARC

protected static final int MDF_ARC
Constant representing the ARC format.

See Also:
Constant Field Values

MDF_WARC

protected static final int MDF_WARC
Constant representing the WARC format.

See Also:
Constant Field Values

metadataFormat

protected static int metadataFormat
The metadata format in use. Recognized formats are MDF_ARC and MDF_WARC.

Constructor Detail

MetadataFileWriter

public MetadataFileWriter()
Method Detail

initializeMetadataFormat

protected static void initializeMetadataFormat()
Initialize the used metadata format from settings.
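
The interplay between metadataFormat, initializeMetadataFormat() and resetMetadataFormat() might look roughly like the sketch below. It is not the NetarchiveSuite implementation: the class name, the numeric values of the constants, the use of 0 for "not initialized", and the string-valued setting are all assumptions made for illustration.

```java
import java.util.Locale;

// Hypothetical stand-in for the format bookkeeping described on this page.
class MetadataFormatSketch {
    // Assumed values; the real ones are listed under "Constant Field Values".
    static final int MDF_ARC = 1;
    static final int MDF_WARC = 2;

    // Assumption: 0 means "not yet initialized".
    static int metadataFormat = 0;

    // Assumption: the setting arrives as a plain string such as "arc" or "warc".
    static void initializeMetadataFormat(String settingValue) {
        String normalized = settingValue.trim().toLowerCase(Locale.ROOT);
        if ("arc".equals(normalized)) {
            metadataFormat = MDF_ARC;
        } else if ("warc".equals(normalized)) {
            metadataFormat = MDF_WARC;
        } else {
            throw new IllegalArgumentException("Unknown metadata format: " + settingValue);
        }
    }

    // Clears the cached format so the next initialization starts from scratch.
    static void resetMetadataFormat() {
        metadataFormat = 0;
    }
}
```

Keeping the format in a static field means every writer created afterwards sees the same choice, which is presumably why a reset hook exists for tests that switch between formats.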


getMetadataArchiveFileName

public static java.lang.String getMetadataArchiveFileName(java.lang.String jobID,
                                                          java.lang.Long harvestID)
                                                   throws ArgumentNotValid
Generates a name for an archive (ARC/WARC) file containing metadata regarding a given job.

Parameters:
jobID - The number of the job that generated the archive file.
harvestID - the harvest ID of the job. Can be null.
Returns:
A "flat" file name (i.e. no path) containing the jobID parameter and ending in "-metadata-N.(w)arc", where N is the serial number of the metadata files for this job, e.g. "42-metadata-1.(w)arc". Currently, only one file is ever made.
Throws:
ArgumentNotValid - if a required parameter was null (harvestID may be null).
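
As an illustration of the documented name pattern, the sketch below builds "<jobID>-metadata-1.(w)arc". The class and method names are hypothetical, the format is passed in as a boolean rather than read from settings, and harvestID is left out because the documented return value only mentions jobID.

```java
// Hypothetical helper reproducing only the documented file name pattern.
class MetadataNameSketch {
    static String metadataFileName(String jobID, boolean warc) {
        if (jobID == null) {
            // The real method is documented to throw ArgumentNotValid instead.
            throw new IllegalArgumentException("jobID must not be null");
        }
        int serial = 1; // the documentation states that only one file is ever made
        return jobID + "-metadata-" + serial + (warc ? ".warc" : ".arc");
    }
}
```

Under these assumptions, metadataFileName("42", true) yields "42-metadata-1.warc" and metadataFileName("42", false) yields "42-metadata-1.arc".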

createWriter

public static MetadataFileWriter createWriter(java.io.File metadataArchiveFile)
Create a writer that writes data to the given archive file.

Parameters:
metadataArchiveFile - The archive file to write to.
Returns:
a writer that writes data to the given archive file.
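
Since the class has one ARC and one WARC subclass, createWriter plausibly acts as a factory that picks a subclass from the configured format. The sketch below shows that pattern with hypothetical classes (SketchWriter, ArcSketchWriter, WarcSketchWriter) and assumed constant values; the format is passed explicitly here instead of being read from a static field.

```java
import java.io.File;

// Hypothetical base class illustrating a format-driven factory method.
abstract class SketchWriter {
    static final int MDF_ARC = 1;  // assumed value
    static final int MDF_WARC = 2; // assumed value

    private final File file;

    protected SketchWriter(File file) {
        this.file = file;
    }

    File getFile() {
        return file;
    }

    // Chooses the concrete writer from the requested format.
    static SketchWriter createWriter(File metadataArchiveFile, int format) {
        if (metadataArchiveFile == null) {
            throw new IllegalArgumentException("metadataArchiveFile must not be null");
        }
        switch (format) {
            case MDF_ARC:
                return new ArcSketchWriter(metadataArchiveFile);
            case MDF_WARC:
                return new WarcSketchWriter(metadataArchiveFile);
            default:
                throw new IllegalStateException("Unknown metadata format: " + format);
        }
    }
}

// Placeholder subclasses; real writers would open the archive file here.
class ArcSketchWriter extends SketchWriter {
    ArcSketchWriter(File file) {
        super(file);
    }
}

class WarcSketchWriter extends SketchWriter {
    WarcSketchWriter(File file) {
        super(file);
    }
}
```

Callers then depend only on the abstract type, so switching between ARC and WARC output does not require changes at the call site.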

close

public abstract void close()
Close the metadata file writer.


getFile

public abstract java.io.File getFile()
Returns:
the finished metadata file

writeFileTo

public abstract void writeFileTo(java.io.File file,
                                 java.lang.String uri,
                                 java.lang.String mime)
Write the given file to the metadata file.

Parameters:
file - A given file with metadata to write to the metadata archive file.
uri - The URI associated with the piece of metadata.
mime - The mimetype associated with the piece of metadata.

writeTo

public abstract boolean writeTo(java.io.File fileToArchive,
                                java.lang.String URL,
                                java.lang.String mimetype)
Writes a File to an ARCWriter if one is available; otherwise logs the failure to the class logger.

Parameters:
fileToArchive - the File to archive
URL - the URL with which it is stored in the arcfile
mimetype - The mimetype of the File-contents
Returns:
true if the file exists and is written to the arcfile.
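
The contract described here (return true on success, log and return false otherwise) can be sketched as follows. RecordSink is a hypothetical stand-in for the underlying archive writer, and the use of java.util.logging is an assumption; the actual subclasses may check and log differently.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.logging.Logger;

// Hypothetical illustration of a "write or log and report failure" method.
class WriteToSketch {
    private static final Logger LOG = Logger.getLogger(WriteToSketch.class.getName());

    // Stand-in for the archive writer that receives the record.
    interface RecordSink {
        void accept(String url, String mimetype, byte[] content) throws IOException;
    }

    static boolean writeTo(File fileToArchive, String url, String mimetype, RecordSink sink) {
        if (fileToArchive == null || !fileToArchive.isFile()) {
            LOG.warning("Not archiving missing file: " + fileToArchive);
            return false;
        }
        try {
            sink.accept(url, mimetype, Files.readAllBytes(fileToArchive.toPath()));
            return true;
        } catch (IOException e) {
            // Failures are logged instead of being propagated to the caller.
            LOG.warning("Could not archive " + fileToArchive + ": " + e);
            return false;
        }
    }
}
```

Returning a boolean lets a caller continue with the remaining files when one of them is missing, at the cost of having to check the result.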

write

public abstract void write(java.lang.String uri,
                           java.lang.String contentType,
                           java.lang.String hostIP,
                           long fetchBeginTimeStamp,
                           byte[] payload)
                    throws java.io.IOException
Write a record to the archive file.

Parameters:
uri - record URI
contentType - content-type of record
hostIP - resource IP address
fetchBeginTimeStamp - record datetime
payload - A byte array containing the payload
Throws:
java.io.IOException
See Also:
ARCWriter.write(String uri, String contentType, String hostIP, long fetchBeginTimeStamp, long recordLength, InputStream in)
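
The method cited under "See Also" takes a record length and an InputStream, whereas this method takes a byte array. One plausible way to bridge the two is shown below. StreamRecordWriter is a hypothetical interface that only mirrors the cited parameter list; it is not the ARCWriter class, and the real subclasses may do the conversion differently.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical adapter from a byte[] payload to a stream-based write call.
class PayloadAdapterSketch {
    // Mirrors the parameter list of the stream-based method cited above.
    interface StreamRecordWriter {
        void write(String uri, String contentType, String hostIP,
                   long fetchBeginTimeStamp, long recordLength, InputStream in)
                throws IOException;
    }

    static void write(StreamRecordWriter target, String uri, String contentType,
                      String hostIP, long fetchBeginTimeStamp, byte[] payload)
            throws IOException {
        // The record length is the payload length; the stream wraps the same bytes.
        try (InputStream in = new ByteArrayInputStream(payload)) {
            target.write(uri, contentType, hostIP, fetchBeginTimeStamp, payload.length, in);
        }
    }
}
```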

insertFiles

public void insertFiles(java.io.File parentDir,
                        java.io.FilenameFilter filter,
                        java.lang.String mimetype,
                        IngestableFiles files)
Append the files contained in the directory to the metadata archive file, but only if the filename matches the supplied filter.

Parameters:
parentDir - directory containing the files to append to metadata
filter - filter describing which files to accept and which to ignore
mimetype - The content-type to write along with the files in the metadata output
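
The file selection step described above can be illustrated with File.listFiles(FilenameFilter). This is only a sketch: the class and method names are hypothetical, the sorting and the restriction to regular files are assumptions, and the undocumented files parameter is left out. Each selected file would then be handed to one of the writer's write methods together with the given mimetype.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper showing how a FilenameFilter narrows a directory listing.
class InsertFilesSketch {
    static List<File> selectFiles(File parentDir, FilenameFilter filter) {
        File[] matches = parentDir.listFiles(filter);
        List<File> selected = new ArrayList<>();
        if (matches == null) {
            // listFiles returns null if parentDir is not a directory or cannot be read.
            return selected;
        }
        Arrays.sort(matches); // deterministic order; an assumption, not documented behavior
        for (File candidate : matches) {
            if (candidate.isFile()) {
                selected.add(candidate);
            }
        }
        return selected;
    }
}
```

A filter such as (dir, name) -> name.endsWith(".log") would select only log files from the directory.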

resetMetadataFormat

public static void resetMetadataFormat()
Reset the metadata format. Should only be used by unit tests.