dk.netarkivet.viewerproxy.webinterface
Class HarvestedUrlsForDomainBatchJob
java.lang.Object
dk.netarkivet.common.utils.batch.FileBatchJob
dk.netarkivet.common.utils.archive.ArchiveBatchJobBase
dk.netarkivet.common.utils.archive.ArchiveBatchJob
dk.netarkivet.viewerproxy.webinterface.HarvestedUrlsForDomainBatchJob
- All Implemented Interfaces:
- java.io.Serializable
public class HarvestedUrlsForDomainBatchJob
- extends ArchiveBatchJob
Batch job that extracts the lines referring to a specific domain from a crawl log.
The batch job should be restricted to run only on the metadata files of a
specific job, using FileBatchJob.processOnlyFilesMatching(String).
- See Also:
- Serialized Form
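As a rough illustration of the restriction described in the class description, the sketch below builds a file name pattern of the kind that could be passed to FileBatchJob.processOnlyFilesMatching(String). The naming scheme (job ID, then "-metadata-", a sequence number, and an .arc or .warc extension), the class MetadataFilePatternSketch, and the method metadataPatternForJob are assumptions made for illustration; they are not part of this API.

```java
import java.util.regex.Pattern;

class MetadataFilePatternSketch {

    /**
     * Builds a regular expression for the metadata files of one job.
     * Assumption: files are named "<jobId>-metadata-<n>.arc" or ".warc",
     * optionally followed by ".gz".
     */
    static String metadataPatternForJob(long jobId) {
        return jobId + "-metadata-[0-9]+\\.w?arc(\\.gz)?";
    }

    public static void main(String[] args) {
        String pattern = metadataPatternForJob(42L);
        // Pattern.matches requires the whole file name to match, so a
        // different job ID that merely ends in "42" is rejected.
        System.out.println(Pattern.matches(pattern, "42-metadata-1.arc"));
        System.out.println(Pattern.matches(pattern, "142-metadata-1.arc"));
    }
}
```

The returned string would then be handed to processOnlyFilesMatching(String) on the batch job before the job is run.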
Field Summary |
(package private) java.lang.String |
domain
The domain to extract crawl.log lines for. |
Method Summary |
void |
finish(java.io.OutputStream os)
Does nothing, no finishing is needed. |
ArchiveBatchFilter |
getFilter()
Returns an ArchiveBatchFilter object which restricts the set of records in the
archive on which this batch-job is performed. |
void |
initialize(java.io.OutputStream os)
Does nothing, no initialisation is needed. |
void |
processRecord(ArchiveRecordBase record,
java.io.OutputStream os)
Processes a crawl log record, writing the lines concerning the given domain to the result. |
java.lang.String |
toString()
Human-readable representation of this instance. |
Methods inherited from class dk.netarkivet.common.utils.batch.FileBatchJob |
addException, addFinishException, addInitializeException, getBatchJobTimeout, getExceptions, getFilenamePattern, getFilesFailed, getNoOfFilesProcessed, maxExceptionsReached, postProcess, processOnlyFileNamed, processOnlyFilesMatching, processOnlyFilesMatching, processOnlyFilesNamed, setBatchJobTimeout |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
domain
final java.lang.String domain
- The domain to extract crawl.log lines for.
HarvestedUrlsForDomainBatchJob
public HarvestedUrlsForDomainBatchJob(java.lang.String domain)
- Initialise the batch job.
- Parameters:
domain
- The domain to get crawl.log lines for.
initialize
public void initialize(java.io.OutputStream os)
- Does nothing, no initialisation is needed.
- Specified by:
initialize
in class ArchiveBatchJobBase
- Parameters:
os
- Not used.
getFilter
public ArchiveBatchFilter getFilter()
- Description copied from class:
ArchiveBatchJob
- Returns an ArchiveBatchFilter object which restricts the set of records in the
archive on which this batch-job is performed. The default value is
a neutral filter which allows all records.
- Overrides:
getFilter
in class ArchiveBatchJob
- Returns:
- A filter telling which records should be given to processRecord().
processRecord
public void processRecord(ArchiveRecordBase record,
java.io.OutputStream os)
- Processes a crawl log record, writing the lines concerning the given domain to the result.
- Specified by:
processRecord
in class ArchiveBatchJob
- Parameters:
record
- The record to process.
os
- The output stream for the result.
- Throws:
ArgumentNotValid
- if either parameter is null.
IOFailure
- if processing the record fails.
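The kind of per-line selection described for processRecord can be sketched in plain Java as follows. This is not the class's actual implementation: it assumes the Heritrix crawl.log layout, in which the fetched URI is the fourth whitespace-separated field, and it treats a line as belonging to the domain when the URI host equals the domain or is a subdomain of it. The class CrawlLogDomainFilterSketch and both of its methods are hypothetical names.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class CrawlLogDomainFilterSketch {

    /** Returns true if the URI on the given crawl log line lies within the domain. */
    static boolean lineBelongsToDomain(String line, String domain) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length < 4) {
            return false; // Too few fields to contain a URI.
        }
        String host;
        try {
            // Assumption: the fourth field holds the fetched URI.
            host = URI.create(fields[3]).getHost();
        } catch (IllegalArgumentException e) {
            return false; // Unparsable URI; skip the line.
        }
        if (host == null) {
            return false; // Opaque URIs such as "dns:..." have no host part.
        }
        String h = host.toLowerCase(Locale.ROOT);
        String d = domain.toLowerCase(Locale.ROOT);
        return h.equals(d) || h.endsWith("." + d);
    }

    /** Collects the lines that belong to the domain, preserving their order. */
    static List<String> extract(List<String> lines, String domain) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            if (lineBelongsToDomain(line, domain)) {
                result.add(line);
            }
        }
        return result;
    }
}
```

Comparing against "." plus the domain, rather than a bare suffix, keeps hosts such as notexample.com from matching example.com.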
finish
public void finish(java.io.OutputStream os)
- Does nothing, no finishing is needed.
- Specified by:
finish
in class ArchiveBatchJobBase
- Parameters:
os
- Not used.
toString
public java.lang.String toString()
- Human-readable representation of this instance.
- Overrides:
toString
in class java.lang.Object
- Returns:
- The class content.