java.lang.Object
  dk.netarkivet.common.utils.batch.FileBatchJob
    dk.netarkivet.common.utils.arc.ARCBatchJob
      dk.netarkivet.wayback.batch.ExtractDeduplicateCDXBatchJob
public class ExtractDeduplicateCDXBatchJob extends ARCBatchJob
This batch job takes deduplication records from a crawl log in a metadata ARC file and converts them to CDX records for use in Wayback.
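
The conversion described above can be sketched in isolation. This is a minimal illustration, not the class's actual implementation. It assumes a Heritrix-style, whitespace-separated crawl-log line whose annotation field carries `duplicate:"<arcfile>,<offset>"`, and it emits a hypothetical CDX-like field order (URL, 14-digit date, MIME type, status, digest, offset, file name) without URL canonicalization. The class name `DedupCdxSketch` and the method `toCdxLine` are invented for this example.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; not part of NetarchiveSuite. */
final class DedupCdxSketch {
    // Assumed annotation shape: duplicate:"<arcfile>,<offset>"
    private static final Pattern DUPLICATE =
            Pattern.compile("duplicate:\"([^,\"]+),(\\d+)\"");

    /**
     * Returns a CDX-like line for a crawl-log line that carries a
     * duplicate annotation, or an empty Optional otherwise.
     */
    static Optional<String> toCdxLine(String crawlLogLine) {
        Matcher m = DUPLICATE.matcher(crawlLogLine);
        if (!m.find()) {
            return Optional.empty(); // not a deduplication record
        }
        // Assumed columns: 0 timestamp, 1 status, 3 URL, 6 MIME type, 9 digest.
        String[] f = crawlLogLine.trim().split("\\s+");
        if (f.length < 10) {
            return Optional.empty(); // too few columns to interpret
        }
        // Reduce an ISO-style timestamp to the 14-digit yyyyMMddHHmmss form.
        String digits = f[0].replaceAll("[^0-9]", "");
        if (digits.length() < 14) {
            return Optional.empty();
        }
        String date = digits.substring(0, 14);
        String digest = f[9].replaceFirst("^sha1:", "");
        String arcFile = m.group(1);
        String offset = m.group(2);
        return Optional.of(String.join(" ",
                f[3], date, f[6], f[1], digest, offset, arcFile));
    }
}
```

Lines without a duplicate annotation yield an empty result, which mirrors the documented behaviour that only duplicate entries produce output.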
Nested Class Summary |
---|
Nested classes/interfaces inherited from class dk.netarkivet.common.utils.batch.FileBatchJob |
---|
FileBatchJob.ExceptionOccurrence |
Field Summary |
---|
Fields inherited from class dk.netarkivet.common.utils.arc.ARCBatchJob |
---|
noOfRecordsProcessed |
Fields inherited from class dk.netarkivet.common.utils.batch.FileBatchJob |
---|
batchJobTimeout, exceptions, filesFailed, noOfFilesProcessed |
Constructor Summary | |
---|---|
ExtractDeduplicateCDXBatchJob()
|
Method Summary | |
---|---|
void |
finish(java.io.OutputStream os)
Does nothing |
void |
initialize(java.io.OutputStream os)
Initializes various fields of this class |
void |
processRecord(org.archive.io.arc.ARCRecord record,
java.io.OutputStream os)
If the ARCRecord is a crawl-log entry then any duplicate entries in the crawl log are converted to CDX entries and written to the output. |
Methods inherited from class dk.netarkivet.common.utils.arc.ARCBatchJob |
---|
getExceptionArray, getFilter, handleException, noOfRecordsProcessed, processFile |
Methods inherited from class dk.netarkivet.common.utils.batch.FileBatchJob |
---|
addException, addFinishException, addInitializeException, getBatchJobTimeout, getExceptions, getFilenamePattern, getFilesFailed, getNoOfFilesProcessed, maxExceptionsReached, postProcess, processOnlyFileNamed, processOnlyFilesMatching, processOnlyFilesMatching, processOnlyFilesNamed, setBatchJobTimeout |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public ExtractDeduplicateCDXBatchJob()
Method Detail |
---|
public void initialize(java.io.OutputStream os)
initialize in class ARCBatchJob
Parameters:
os - unused parameter
public void processRecord(org.archive.io.arc.ARCRecord record, java.io.OutputStream os)
processRecord in class ARCBatchJob
Parameters:
record - the ARCRecord to be processed
os - the stream to which output is written
public void finish(java.io.OutputStream os)
finish in class ARCBatchJob
Parameters:
os -
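
To show how the three callbacks fit together, the following sketch uses stand-in types rather than the real NetarchiveSuite classes. `RecordJobSketch`, `DuplicateLineJobSketch`, and the `run` driver are hypothetical, records are plain strings instead of `ARCRecord` objects, and the call order (initialize once, processRecord per record, finish once) is an assumption about how the batch framework drives a job.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical stand-in for the initialize/processRecord/finish contract. */
abstract class RecordJobSketch {
    abstract void initialize(OutputStream os);
    abstract void processRecord(String record, OutputStream os);
    abstract void finish(OutputStream os);

    /** Assumed driver: one initialize, one processRecord per record, one finish. */
    static String run(RecordJobSketch job, List<String> records) {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        job.initialize(os);
        for (String record : records) {
            job.processRecord(record, os);
        }
        job.finish(os);
        return new String(os.toByteArray(), StandardCharsets.UTF_8);
    }
}

/** Writes through only the records that carry a duplicate annotation. */
final class DuplicateLineJobSketch extends RecordJobSketch {
    private int seen;

    @Override
    void initialize(OutputStream os) {
        seen = 0; // reset state; the stream is unused here
    }

    @Override
    void processRecord(String record, OutputStream os) {
        seen++;
        if (record.contains("duplicate:")) {
            try {
                os.write((record + "\n").getBytes(StandardCharsets.UTF_8));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    @Override
    void finish(OutputStream os) {
        // nothing to flush or summarize
    }

    int seen() {
        return seen;
    }
}
```

Keeping all output in `processRecord` leaves `finish` empty, which matches the "Does nothing" description in the method summary.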