Class MetadataEntry
- java.lang.Object
  - dk.netarkivet.harvester.harvesting.metadata.MetadataEntry
- All Implemented Interfaces:
Serializable
public class MetadataEntry extends Object implements Serializable
Class used to carry metadata in DoOneCrawl messages, including the URL and MIME type needed to write the metadata to metadata (W)ARC files.
- See Also:
- Serialized Form
-
Constructor Summary
| Constructor | Description |
| --- | --- |
| MetadataEntry(String url, String mimeType, String data) | Constructor for this class. |
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| byte[] | getData() | |
| static List<MetadataEntry> | getMetadataFromDisk(File sourceDir) | Retrieve a list of serialized metadata entries on disk. |
| String | getMimeType() | |
| String | getURL() | |
| boolean | isDuplicateReductionMetadataEntry() | Checks whether this is a duplicate reduction MetadataEntry. |
| static MetadataEntry | makeAliasMetadataEntry(List<AliasInfo> aliases, Long origHarvestDefinitionID, int harvestNum, Long jobId) | Generate a MetadataEntry from a list of AliasInfo objects (VERSION 2). Expired aliases are skipped by this method. |
| static MetadataEntry | makeDuplicateReductionMetadataEntry(List<Long> jobIDsForDuplicateReduction, Long origHarvestDefinitionID, int harvestNum, Long jobId) | Generate a MetadataEntry from a list of job ids for duplicate reduction. |
| static void | storeMetadataToDisk(List<MetadataEntry> metadata, File destinationDir) | Store a list of metadata entries to disk. |
| String | toString() | |
-
Constructor Detail
-
MetadataEntry
public MetadataEntry(String url, String mimeType, String data)
Constructor for this class.
- Parameters:
url
- the URL assigned to this metadata (needed for it to be searchable)
mimeType
- the MIME type of this metadata (normally text/plain or text/xml)
data
- the metadata itself
- Throws:
ArgumentNotValid
- if any argument is null or an empty string, if url is not a valid URL, or if mimeType is not a valid MIME type
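The argument checks listed under Throws can be illustrated with a small stand-in class. This is only a sketch under stated assumptions: `SketchMetadataEntry` is a hypothetical name, `IllegalArgumentException` stands in for `ArgumentNotValid`, and the URL and MIME type checks (an absolute `java.net.URI` and a simple `type/subtype` pattern) are guesses at what "valid" might mean, not the library's actual rules. Storing the payload as UTF-8 bytes is likewise an assumption suggested by the `byte[]` return type of `getData()`.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

/** Hypothetical stand-in for illustration; NOT the NetarchiveSuite class. */
final class SketchMetadataEntry {
    // Assumed MIME check: "type/subtype" built from token characters.
    private static final Pattern MIME_TYPE =
            Pattern.compile("[A-Za-z0-9!#$&^_.+-]+/[A-Za-z0-9!#$&^_.+-]+");

    private final String url;
    private final String mimeType;
    private final byte[] data;

    SketchMetadataEntry(String url, String mimeType, String data) {
        requireNonEmpty(url, "url");
        requireNonEmpty(mimeType, "mimeType");
        requireNonEmpty(data, "data");
        if (!isAbsoluteUri(url)) {
            throw new IllegalArgumentException("url is not a valid URL: " + url);
        }
        if (!MIME_TYPE.matcher(mimeType).matches()) {
            throw new IllegalArgumentException("mimeType is not valid: " + mimeType);
        }
        this.url = url;
        this.mimeType = mimeType;
        // Assumption: the String payload is kept as UTF-8 bytes.
        this.data = data.getBytes(StandardCharsets.UTF_8);
    }

    private static void requireNonEmpty(String value, String name) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException(name + " must not be null or empty");
        }
    }

    // Assumed URL check: the string parses as a URI and carries a scheme.
    private static boolean isAbsoluteUri(String candidate) {
        try {
            return new URI(candidate).isAbsolute();
        } catch (URISyntaxException e) {
            return false;
        }
    }

    String getURL() { return url; }
    String getMimeType() { return mimeType; }
    byte[] getData() { return data.clone(); } // defensive copy

    public static void main(String[] args) {
        SketchMetadataEntry entry = new SketchMetadataEntry(
                "metadata://example.org/crawl/setup?jobid=1", "text/plain", "some metadata");
        System.out.println(entry.getURL() + " (" + entry.getMimeType() + ")");
        try {
            new SketchMetadataEntry("not a url", "text/plain", "some metadata");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Validating in the constructor means an entry that exists is always safe to write, which is presumably why the real class throws rather than deferring the check to write time.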
-
Method Detail
-
makeAliasMetadataEntry
public static MetadataEntry makeAliasMetadataEntry(List<AliasInfo> aliases, Long origHarvestDefinitionID, int harvestNum, Long jobId)
Generate a MetadataEntry from a list of AliasInfo objects (VERSION 2). Expired aliases are skipped by this method.
- Parameters:
aliases
- the list of aliases (possibly empty)
origHarvestDefinitionID
- the harvest definition behind the job with the given jobId
harvestNum
- the number of the harvest that the job with the given jobId belongs to
jobId
- the ID of the job that this metadata belongs to
- Returns:
- null if the list is empty (or consists only of expired aliases); otherwise a MetadataEntry built from the unexpired aliases in the list.
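The "skip expired aliases, return null if nothing is left" contract can be sketched without the library. Everything named here is hypothetical: `Alias` is a minimal stand-in for `AliasInfo` with only the fields the sketch needs, and the one-line-per-alias text format is an assumption, since this page does not describe the payload layout.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the filtering contract; NOT the NetarchiveSuite code. */
final class AliasMetadataSketch {
    /** Minimal stand-in for AliasInfo (assumed fields only). */
    static final class Alias {
        final String domain;
        final String aliasOf;
        final boolean expired;

        Alias(String domain, String aliasOf, boolean expired) {
            this.domain = domain;
            this.aliasOf = aliasOf;
            this.expired = expired;
        }
    }

    /** Returns one line per unexpired alias, or null if none remain. */
    static String aliasPayload(List<Alias> aliases) {
        List<String> lines = new ArrayList<>();
        for (Alias alias : aliases) {
            if (alias.expired) {
                continue; // expired aliases are skipped
            }
            // Assumed line format; the real payload layout is not documented here.
            lines.add(alias.domain + " is an alias for " + alias.aliasOf);
        }
        return lines.isEmpty() ? null : String.join("\n", lines);
    }

    public static void main(String[] args) {
        List<Alias> aliases = Arrays.asList(
                new Alias("a.example", "b.example", false),
                new Alias("old.example", "b.example", true));
        System.out.println(aliasPayload(aliases));
    }
}
```

Because the documented return value can be null, callers of the real method would need a null check before adding the result to a job's metadata list.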
-
makeDuplicateReductionMetadataEntry
public static MetadataEntry makeDuplicateReductionMetadataEntry(List<Long> jobIDsForDuplicateReduction, Long origHarvestDefinitionID, int harvestNum, Long jobId)
Generate a MetadataEntry from a list of job ids for duplicate reduction.
- Parameters:
jobIDsForDuplicateReduction
- the list of job ids (possibly empty)
origHarvestDefinitionID
- the harvest definition behind the job with the given jobId
harvestNum
- the number of the harvest that the job with the given jobId belongs to
jobId
- the ID of the job that this metadata belongs to
- Returns:
- null if the list is empty; otherwise a MetadataEntry built from the list of job ids.
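A self-contained sketch of the same shape, under loudly stated assumptions: `DuplicateReductionSketch` and its `Entry` type are hypothetical, the URL layout is invented for illustration (the real template is not given on this page), and joining the job ids with commas as `text/plain` is a guess at the payload format.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical sketch; NOT the NetarchiveSuite implementation. */
final class DuplicateReductionSketch {
    /** Minimal stand-in for a metadata entry: URL, MIME type, payload. */
    static final class Entry {
        final String url;
        final String mimeType;
        final byte[] data;

        Entry(String url, String mimeType, String data) {
            this.url = url;
            this.mimeType = mimeType;
            this.data = data.getBytes(StandardCharsets.UTF_8);
        }
    }

    static Entry makeDuplicateReductionEntry(
            List<Long> jobIds, Long harvestId, int harvestNum, Long jobId) {
        if (jobIds.isEmpty()) {
            return null; // documented contract: null for an empty list
        }
        // Assumed URL layout, made up for this example.
        String url = "metadata://example.org/duplicatereduction"
                + "?harvestid=" + harvestId
                + "&harvestnum=" + harvestNum
                + "&jobid=" + jobId;
        // Assumed payload: the job ids joined by commas.
        String data = jobIds.stream().map(String::valueOf).collect(Collectors.joining(","));
        return new Entry(url, "text/plain", data);
    }

    public static void main(String[] args) {
        Entry entry = makeDuplicateReductionEntry(Arrays.asList(7L, 9L), 1L, 2, 42L);
        System.out.println(entry.url + " -> " + new String(entry.data, StandardCharsets.UTF_8));
        Entry none = makeDuplicateReductionEntry(Collections.emptyList(), 1L, 2, 42L);
        System.out.println(none == null ? "nothing to write" : "unexpected entry");
    }
}
```

Encoding the harvest definition, harvest number, and job id in the URL is one way to keep the entry searchable, which is the stated purpose of the url argument in the constructor.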
-
getData
public byte[] getData()
- Returns:
- Returns the data.
-
getMimeType
public String getMimeType()
- Returns:
- Returns the mimeType.
-
getURL
public String getURL()
- Returns:
- Returns the URL.
-
isDuplicateReductionMetadataEntry
public boolean isDuplicateReductionMetadataEntry()
Checks whether this is a duplicate reduction MetadataEntry.
- Returns:
- true if this is a duplicate reduction MetadataEntry, otherwise false.
-
toString
public String toString()
-
storeMetadataToDisk
public static void storeMetadataToDisk(List<MetadataEntry> metadata, File destinationDir)
Store a list of metadata entries to disk.
- Parameters:
metadata
- the given metadata
destinationDir
- the directory in which to store the metadata.
-
getMetadataFromDisk
public static List<MetadataEntry> getMetadataFromDisk(File sourceDir)
Retrieve a list of serialized metadata entries on disk.
- Parameters:
sourceDir
- the directory where the metadata is stored.
- Returns:
- the list of deserialized MetadataEntry objects.
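Since the class is Serializable, a store and retrieve pair can be sketched with standard Java object serialization. This is an assumption about the mechanism, not a description of the library's code: `MetadataDiskSketch`, its `Entry` type, and the `entry-NNNNN.ser` file naming are all hypothetical, and the real methods may name and filter files differently.

```java
import java.io.File;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical round-trip sketch; NOT the NetarchiveSuite implementation. */
final class MetadataDiskSketch {
    /** Minimal serializable stand-in for a metadata entry. */
    static final class Entry implements Serializable {
        private static final long serialVersionUID = 1L;
        final String url;
        final String mimeType;
        final byte[] data;

        Entry(String url, String mimeType, String data) {
            this.url = url;
            this.mimeType = mimeType;
            this.data = data.getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Writes each entry to its own file in dir (assumed naming scheme). */
    static void store(List<Entry> entries, File dir) throws IOException {
        int index = 0;
        for (Entry entry : entries) {
            // Zero-padded names keep lexicographic order equal to list order.
            String name = String.format(Locale.ROOT, "entry-%05d.ser", index++);
            File target = new File(dir, name);
            try (ObjectOutputStream out =
                    new ObjectOutputStream(Files.newOutputStream(target.toPath()))) {
                out.writeObject(entry);
            }
        }
    }

    /** Reads back every ".ser" file in dir, sorted by file name. */
    static List<Entry> load(File dir) throws IOException, ClassNotFoundException {
        List<Entry> result = new ArrayList<>();
        File[] files = dir.listFiles((d, name) -> name.endsWith(".ser"));
        if (files == null) {
            return result; // dir does not exist or is not a directory
        }
        Arrays.sort(files);
        for (File file : files) {
            try (ObjectInputStream in =
                    new ObjectInputStream(Files.newInputStream(file.toPath()))) {
                result.add((Entry) in.readObject());
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        File dir = Files.createTempDirectory("metadata-sketch").toFile();
        store(Arrays.asList(
                new Entry("metadata://example.org/a", "text/plain", "one"),
                new Entry("metadata://example.org/b", "text/xml", "<two/>")), dir);
        System.out.println("entries read back: " + load(dir).size());
    }
}
```

Writing one file per entry keeps a corrupt file from invalidating the rest, at the cost of more files in the directory; whether the real methods make the same trade-off is not stated on this page.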