public class MetadataFile extends Object implements Comparable<MetadataFile>
Defines a natural order for sorting MetadataFile instances.
Modifier and Type | Field and Description |
---|---|
static String | CDX_PATTERN: A pattern identifying a CDX metadata entry. |
static String | CRAWL_LOG_PATTERN: A pattern identifying the crawl log metadata entry. |
static String | DOMAIN_SETTINGS_FILE: The name of a domain-specific Heritrix settings file (a.k.a. |
static String | HERITRIX_FILE_PATTERN: The pattern controlling which files in the crawl directory root should be stored in the metadata ARC. |
static String | LOG_FILE_PATTERN: The pattern controlling which files in the logs subdirectory of the crawl directory root should be stored in the metadata ARC as log files. |
static String | RECOVER_LOG_PATTERN: A pattern identifying the recover.gz log metadata entry. |
static String | REPORT_FILE_PATTERN: The pattern controlling which files in the crawl directory root should be stored in the metadata ARC as reports. |
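The pattern fields above select files by name. The sketch below illustrates that idea on the assumption that the patterns are Java regular expressions matched against the whole file name; this page does not confirm that. The class PatternFilterSketch, the method selectMatching and the value EXAMPLE_REPORT_PATTERN are hypothetical and are not part of MetadataFile.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class PatternFilterSketch {
    // Hypothetical value for illustration only; the real value of
    // REPORT_FILE_PATTERN is not shown on this page.
    static final String EXAMPLE_REPORT_PATTERN = ".*-report\\.txt";

    // Returns the file names that match the regular expression in full,
    // preserving the input order.
    static List<String> selectMatching(List<String> fileNames, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> selected = new ArrayList<>();
        for (String name : fileNames) {
            if (pattern.matcher(name).matches()) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("crawl-report.txt", "order.xml", "hosts-report.txt");
        System.out.println(selectMatching(names, EXAMPLE_REPORT_PATTERN));
    }
}
```

If the real patterns use a different syntax or partial matching, only the body of selectMatching would need to change.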
Constructor and Description |
---|
MetadataFile(File heritrixFile, Long harvestId, Long jobId, String heritrixVersion): Creates a metadata file and finds which metadata type it belongs to. |
MetadataFile(File heritrixFile, Long harvestId, Long jobId, String heritrixVersion, String domain): Creates a metadata file for a domain-specific override file. |
Modifier and Type | Method and Description |
---|---|
int | compareTo(MetadataFile other): First we compare the type ordinals, then the URLs. |
File | getHeritrixFile(): Returns the actual file. |
String | getUrl() |
public static final String CDX_PATTERN
See Also: CDXDataCache.CDXDataCache(), Constant Field Values
public static final String CRAWL_LOG_PATTERN
See Also: #CrawlLogDataCache(), Constant Field Values
public static final String RECOVER_LOG_PATTERN
public static final String HERITRIX_FILE_PATTERN
public static final String REPORT_FILE_PATTERN
public static final String LOG_FILE_PATTERN
public static final String DOMAIN_SETTINGS_FILE
public MetadataFile(File heritrixFile, Long harvestId, Long jobId, String heritrixVersion)
public MetadataFile(File heritrixFile, Long harvestId, Long jobId, String heritrixVersion, String domain)
Parameters:
heritrixFile - A given Heritrix metadata file.
harvestId - The ID of the harvest that the job generating this file is part of.
jobId - The ID of the job generating this file.
heritrixVersion - The version of Heritrix generating the file.
domain - The name of the domain this metadata belongs to.
public File getHeritrixFile()
public int compareTo(MetadataFile other)
Specified by: compareTo in interface Comparable<MetadataFile>
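The ordering described for compareTo (type ordinals first, then URLs) can be sketched with a small stand-in class. This is an illustration under stated assumptions, not the library's implementation: the enum Kind, its constants and their order, the class Entry and the example URLs are all hypothetical, since this page lists neither the real metadata types nor the URL format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class OrderingSketch {
    // Hypothetical metadata types; the real names and their order are not listed here.
    enum Kind { SETUP, REPORT, LOG }

    // Stand-in for MetadataFile that keeps only the two values used for ordering.
    static final class Entry implements Comparable<Entry> {
        final Kind kind;
        final String url;

        Entry(Kind kind, String url) {
            this.kind = kind;
            this.url = url;
        }

        @Override
        public int compareTo(Entry other) {
            // Compare the type ordinals first.
            int byKind = Integer.compare(kind.ordinal(), other.kind.ordinal());
            if (byKind != 0) {
                return byKind;
            }
            // Same type: fall back to the URLs.
            return url.compareTo(other.url);
        }
    }

    // Sorts a copy of the entries by natural order and returns their URLs.
    static List<String> sortedUrls(List<Entry> entries) {
        List<Entry> copy = new ArrayList<>(entries);
        Collections.sort(copy);
        List<String> urls = new ArrayList<>();
        for (Entry entry : copy) {
            urls.add(entry.url);
        }
        return urls;
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
                new Entry(Kind.LOG, "example:b"),
                new Entry(Kind.SETUP, "example:z"),
                new Entry(Kind.LOG, "example:a"));
        System.out.println(sortedUrls(entries));
    }
}
```

Because the type comparison takes precedence, an entry of an earlier type sorts first even when its URL is lexicographically later.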
Copyright © 2005–2016 The Royal Danish Library, the Danish State and University Library, the National Library of France and the Austrian National Library. All rights reserved.