Package | Description |
---|---|
dk.netarkivet.common.utils.batch | |
dk.netarkivet.common.utils.cdx | |
dk.netarkivet.common.utils.warc | |
dk.netarkivet.wayback.batch | |
Modifier and Type | Field and Description |
---|---|
static WARCBatchFilter | WARCBatchFilter.EXCLUDE_NON_RESPONSE_RECORDS: Default filter that accepts only response records. |
static WARCBatchFilter | WARCBatchFilter.NO_FILTER: Default filter that accepts everything. |
static WARCBatchFilter | WARCBatchFilter.ONLY_HTTP_ENTRIES: Filter that accepts only records whose URL starts with http. |
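
The three predefined filters can be read as simple predicates over a record. The following is a minimal sketch of that idea, not the NetarchiveSuite implementation: `SimpleRecord`, `select`, and the class `DefaultFilterSketch` are hypothetical stand-ins invented for illustration, and the exact matching rules of the real filters (for example case sensitivity of the URL check) are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class DefaultFilterSketch {

    /** Hypothetical, simplified stand-in for a WARC record; not the real record class. */
    static final class SimpleRecord {
        final String warcType; // value of the WARC-Type header, e.g. "response"
        final String url;      // target URL of the record, may be null

        SimpleRecord(String warcType, String url) {
            this.warcType = warcType;
            this.url = url;
        }
    }

    /** Accepts every record. */
    static final Predicate<SimpleRecord> NO_FILTER = r -> true;

    /** Accepts only records whose WARC-Type is "response". */
    static final Predicate<SimpleRecord> EXCLUDE_NON_RESPONSE_RECORDS =
            r -> "response".equals(r.warcType);

    /** Accepts only records whose URL starts with "http" (so "https" also passes). */
    static final Predicate<SimpleRecord> ONLY_HTTP_ENTRIES =
            r -> r.url != null && r.url.startsWith("http");

    /** Returns the records accepted by the given filter, in their original order. */
    static List<SimpleRecord> select(List<SimpleRecord> records, Predicate<SimpleRecord> filter) {
        List<SimpleRecord> accepted = new ArrayList<>();
        for (SimpleRecord r : records) {
            if (filter.test(r)) {
                accepted.add(r);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<SimpleRecord> records = Arrays.asList(
                new SimpleRecord("warcinfo", null),
                new SimpleRecord("request", "http://example.org/"),
                new SimpleRecord("response", "https://example.org/"),
                new SimpleRecord("response", "dns:example.org"));

        System.out.println(select(records, NO_FILTER).size());                    // 4
        System.out.println(select(records, EXCLUDE_NON_RESPONSE_RECORDS).size()); // 2
        System.out.println(select(records, ONLY_HTTP_ENTRIES).size());            // 2
    }
}
```

In this sketch a DNS response record passes the response-only filter but not the http-only filter, which shows that the two restrictions are independent of each other.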
Modifier and Type | Method and Description |
---|---|
static WARCBatchFilter | WARCBatchFilter.getMimetypeBatchFilter(String mimetype): Note that the mimetype of a WARC response record is not necessarily the same as that of its payload. |
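
The distinction in the note above can be illustrated with a small sketch. It assumes a hypothetical `SimpleRecord` that carries both the record-level Content-Type and the Content-Type of the archived HTTP payload. The helper names `byRecordMimetype` and `byPayloadMimetype` are invented for illustration, and the sketch does not claim which of the two values `getMimetypeBatchFilter` actually inspects.

```java
import java.util.function.Predicate;

class MimetypeFilterSketch {

    /** Hypothetical stand-in for a WARC response record; not the real record class. */
    static final class SimpleRecord {
        final String recordMimetype;  // Content-Type of the WARC record itself
        final String payloadMimetype; // Content-Type announced inside the archived HTTP response

        SimpleRecord(String recordMimetype, String payloadMimetype) {
            this.recordMimetype = recordMimetype;
            this.payloadMimetype = payloadMimetype;
        }
    }

    /** Builds a filter that compares the given mimetype with the record-level Content-Type. */
    static Predicate<SimpleRecord> byRecordMimetype(String mimetype) {
        return r -> mimetype.equalsIgnoreCase(r.recordMimetype);
    }

    /** Builds a filter that compares the given mimetype with the payload Content-Type. */
    static Predicate<SimpleRecord> byPayloadMimetype(String mimetype) {
        return r -> mimetype.equalsIgnoreCase(r.payloadMimetype);
    }

    public static void main(String[] args) {
        // An archived HTML page: the record is an HTTP response, the payload is text/html.
        SimpleRecord page = new SimpleRecord("application/http; msgtype=response", "text/html");

        System.out.println(byRecordMimetype("text/html").test(page));  // false
        System.out.println(byPayloadMimetype("text/html").test(page)); // true
    }
}
```

The same record is rejected or accepted depending on which mimetype is compared, so it is worth checking which one a given filter uses before relying on its results.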
Modifier and Type | Method and Description |
---|---|
WARCBatchFilter | WARCExtractCDXJob.getFilter(): Filters out all non-response records. |
Modifier and Type | Method and Description |
---|---|
WARCBatchFilter | WARCBatchJob.getFilter(): Returns a BatchFilter object that restricts the set of WARC records in the archive on which this batch job is performed. |
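
How a job's `getFilter()` restricts the records it sees can be sketched as follows. This is an illustration under stated assumptions, not the NetarchiveSuite code: `SketchBatchJob`, `ResponseUrlJob`, `SimpleRecord`, and the driver method `run` are all hypothetical names, and the real batch framework may apply the filter differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class BatchJobSketch {

    /** Hypothetical, simplified stand-in for a WARC record; not the real record class. */
    static final class SimpleRecord {
        final String warcType;
        final String url;

        SimpleRecord(String warcType, String url) {
            this.warcType = warcType;
            this.url = url;
        }
    }

    /** Hypothetical batch job: the filter decides which records reach processRecord. */
    abstract static class SketchBatchJob {
        /** Restricts the set of records the job is performed on; accepts everything by default. */
        Predicate<SimpleRecord> getFilter() {
            return r -> true;
        }

        abstract void processRecord(SimpleRecord record);
    }

    /** Hypothetical driver: applies the job's filter before handing records to the job. */
    static int run(SketchBatchJob job, List<SimpleRecord> records) {
        Predicate<SimpleRecord> filter = job.getFilter();
        int processed = 0;
        for (SimpleRecord r : records) {
            if (filter.test(r)) {
                job.processRecord(r);
                processed++;
            }
        }
        return processed;
    }

    /** Example job that collects the URLs of response records only. */
    static final class ResponseUrlJob extends SketchBatchJob {
        final List<String> urls = new ArrayList<>();

        @Override
        Predicate<SimpleRecord> getFilter() {
            return r -> "response".equals(r.warcType);
        }

        @Override
        void processRecord(SimpleRecord record) {
            urls.add(record.url);
        }
    }

    public static void main(String[] args) {
        ResponseUrlJob job = new ResponseUrlJob();
        int processed = run(job, Arrays.asList(
                new SimpleRecord("request", "http://example.org/"),
                new SimpleRecord("response", "http://example.org/")));
        System.out.println(processed + " record(s) processed: " + job.urls);
    }
}
```

Keeping the restriction in `getFilter()` rather than inside `processRecord` means the processing code never has to check record types itself.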
Modifier and Type | Method and Description |
---|---|
WARCBatchFilter | WaybackCDXExtractionWARCBatchJob.getFilter(): Returns the filter; currently only response records are processed. |
Copyright © 2005–2015 The Royal Danish Library, the Danish State and University Library, the National Library of France and the Austrian National Library. All rights reserved.