java.lang.Object
  dk.netarkivet.harvester.harvesting.IngestableFiles
public class IngestableFiles
Encapsulation of files to be ingested into the archive. These files are presently placed in subdirectories under the crawldir.
Field Summary | |
---|---|
static java.lang.String | METADATA_FILENAME_FORMAT |
protected static java.lang.String | METADATA_SUB_DIR: Subdir with final metadata file in it. |
Constructor Summary | |
---|---|
IngestableFiles(HeritrixFiles files) | Constructor for this class. |
Method Summary | |
---|---|
void | cleanup(): Remove any temporary files. |
void | closeOpenFiles(int waitSeconds): Close any ".open" files left by a crashed Heritrix. |
protected void | closeOpenFiles(java.lang.String archiveDirName, java.io.FilenameFilter filter): Given an archive sub-directory name and a filter to match against, this method tries to rename the matched files. |
java.util.List<java.io.File> | getArcFiles(): Get a list of all ARC files that should get ingested. |
java.io.File | getArcsDir() |
java.io.File | getCrawlDir() |
long | getHarvestID() |
java.lang.String | getHarvestnamePrefix() |
long | getJobId() |
java.util.List<java.io.File> | getMetadataArcFiles(): Gets the files containing the metadata. |
protected java.io.File | getMetadataFile(): Constructs the single metadata arc file from the crawlDir and the jobID. |
MetadataFileWriter | getMetadataWriter(): Get a MetadataFileWriter for the temporary metadata file. |
java.io.File | getTmpMetadataDir(): Constructs the TEMPORARY metadata subdir from the crawlDir. |
java.util.List<java.io.File> | getWarcFiles(): Get a list of all WARC files that should get ingested. |
java.io.File | getWarcsDir() |
boolean | isMetadataFailed(): Return true if the metadata generation process is known to have failed. |
boolean | isMetadataReady(): Check whether the metadata file already exists. |
void | setMetadataGenerationSucceeded(boolean success): Marks generated metadata as final, closes the writer, and moves the temporary metadata file to its final position, if successful. |
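
The two closeOpenFiles methods are described as renaming ".open" files selected by a filename filter. The following is a minimal sketch of that idea using only java.io; it is not the NetarchiveSuite implementation. The class name OpenFileCloserSketch, the OPEN_SUFFIX constant, and the assumption that "renaming" means dropping the ".open" suffix are all illustrative.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in; NOT the actual IngestableFiles code. */
class OpenFileCloserSketch {
    /** Assumed suffix that marks a file as still open. */
    static final String OPEN_SUFFIX = ".open";

    /**
     * Renames every file in archiveDir that the filter accepts and that
     * ends with OPEN_SUFFIX, by stripping the suffix (an assumption).
     * Returns the files under their new names.
     */
    static List<File> closeOpenFiles(File archiveDir, FilenameFilter filter) {
        List<File> renamed = new ArrayList<>();
        File[] matches = archiveDir.listFiles(filter);
        if (matches == null) {
            // Directory is missing or unreadable: nothing to rename.
            return renamed;
        }
        for (File open : matches) {
            String name = open.getName();
            if (!name.endsWith(OPEN_SUFFIX)) {
                continue; // the filter matched something that is not ".open"
            }
            String closedName =
                    name.substring(0, name.length() - OPEN_SUFFIX.length());
            File closed = new File(archiveDir, closedName);
            // Do not overwrite an existing file; skip on any rename failure.
            if (!closed.exists() && open.renameTo(closed)) {
                renamed.add(closed);
            }
        }
        return renamed;
    }
}
```

Calling `OpenFileCloserSketch.closeOpenFiles(dir, (d, name) -> name.endsWith(".open"))` would rename each matching file in `dir`. The sketch skips a file rather than failing when the rename does not succeed; whether the real method logs or throws in that case is not stated here.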
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected static final java.lang.String METADATA_SUB_DIR
public static final java.lang.String METADATA_FILENAME_FORMAT
Constructor Detail |
---|
public IngestableFiles(HeritrixFiles files)
Parameters:
files - An instance of HeritrixFiles
Throws:
ArgumentNotValid - if null arguments are given; if jobID < 1; if crawlDir does not exist
Method Detail |
---|
public boolean isMetadataReady()
public boolean isMetadataFailed()
public void setMetadataGenerationSucceeded(boolean success)
Parameters:
success - True if metadata was successfully generated, false otherwise.
Throws:
PermissionDenied - If the metadata has already been marked as ready, or if no metadata file exists upon success.
IOFailure - if there is an error marking the metadata as ready.
public MetadataFileWriter getMetadataWriter()
Throws:
PermissionDenied - if metadata generation is already finished.
public java.util.List<java.io.File> getMetadataArcFiles()
Throws:
PermissionDenied - if the metadata file is not ready, either because generation is still going on or there was an error generating the metadata.
protected java.io.File getMetadataFile()
public java.io.File getTmpMetadataDir()
public java.util.List<java.io.File> getArcFiles()
public java.io.File getArcsDir()
public java.io.File getWarcsDir()
public java.util.List<java.io.File> getWarcFiles()
public void closeOpenFiles(int waitSeconds)
Parameters:
waitSeconds - How many seconds to wait before closing files. This may be done in order to allow Heritrix to finish writing before we close the files.
protected void closeOpenFiles(java.lang.String archiveDirName, java.io.FilenameFilter filter)
Parameters:
archiveDirName - archive directory name, currently "arc" or "warc"
filter - filename filter used to select ".open" files to rename
public void cleanup()
public long getJobId()
public long getHarvestID()
public java.lang.String getHarvestnamePrefix()
public java.io.File getCrawlDir()
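
The contract documented for isMetadataReady(), isMetadataFailed(), setMetadataGenerationSucceeded(boolean) and the metadata getters can be read as a small state machine: metadata is written to a temporary file and moved to its final position on success. The sketch below models only that documented contract. The class MetadataLifecycleSketch, its two-file constructor, and the use of IllegalStateException and UncheckedIOException in place of PermissionDenied and IOFailure are assumptions made so that the example compiles without NetarchiveSuite on the classpath.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

/**
 * Hypothetical stand-in that mirrors the documented contract only.
 * It is NOT the NetarchiveSuite IngestableFiles implementation.
 */
class MetadataLifecycleSketch {
    private final File tmpFile;   // assumed temporary metadata file
    private final File finalFile; // assumed final metadata file
    private boolean failed = false;

    MetadataLifecycleSketch(File tmpFile, File finalFile) {
        this.tmpFile = tmpFile;
        this.finalFile = finalFile;
    }

    /** Ready means the final metadata file already exists. */
    boolean isMetadataReady() {
        return finalFile.isFile();
    }

    /** True once generation has been reported as failed. */
    boolean isMetadataFailed() {
        return failed;
    }

    void setMetadataGenerationSucceeded(boolean success) {
        if (isMetadataReady()) {
            // Stand-in for PermissionDenied: already marked as ready.
            throw new IllegalStateException("Metadata already marked as ready");
        }
        if (!success) {
            failed = true;
            return;
        }
        if (!tmpFile.isFile()) {
            // Stand-in for PermissionDenied: nothing to promote on success.
            throw new IllegalStateException("No metadata file exists upon success");
        }
        try {
            // Move the temporary file to its final position.
            Files.move(tmpFile.toPath(), finalFile.toPath());
        } catch (IOException e) {
            // Stand-in for IOFailure.
            throw new UncheckedIOException("Error marking metadata as ready", e);
        }
    }

    /** Returns the final file, or throws if metadata is not ready yet. */
    File getMetadataFile() {
        if (!isMetadataReady()) {
            throw new IllegalStateException("Metadata file is not ready");
        }
        return finalFile;
    }
}
```

Deriving readiness from the existence of the final file, rather than from an in-memory flag, follows the summary of isMetadataReady() ("check whether the metadata file already exists"). How the real class tracks the failed state is not documented here, so the boolean field is purely illustrative.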