public abstract class WARCBatchFilter extends Object implements Serializable
Modifier and Type | Field and Description |
---|---|
static WARCBatchFilter | EXCLUDE_NON_RESPONSE_RECORDS: A default filter that accepts only response records. |
static WARCBatchFilter | NO_FILTER: A default filter that accepts everything. |
static WARCBatchFilter | ONLY_HTTP_ENTRIES: A filter that accepts only records whose URL starts with http. |
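The behaviour of the three predefined filters can be pictured with a small self-contained sketch. It does not use the real classes: `SimpleRecord` and `SimpleBatchFilter` are hypothetical stand-ins for `org.archive.io.warc.WARCRecord` and `WARCBatchFilter`, and the way the record type is checked (a plain `"response"` string) is an assumption made for illustration only.

```java
import java.io.Serializable;

/** Illustrative sketch only; none of these types are the real library classes. */
class BatchFilterSketch {

    /** Hypothetical minimal record: a record type and a target URL. */
    static final class SimpleRecord {
        final String type;
        final String url;

        SimpleRecord(String type, String url) {
            this.type = type;
            this.url = url;
        }
    }

    /** Mirrors the documented shape: a named, serializable filter with an abstract accept(). */
    abstract static class SimpleBatchFilter implements Serializable {
        private final String name;

        protected SimpleBatchFilter(String name) {
            this.name = name;
        }

        protected String getName() {
            return name;
        }

        /** Returns true if the record is accepted (not filtered out). */
        public abstract boolean accept(SimpleRecord record);
    }

    /** Accepts everything. */
    static final SimpleBatchFilter NO_FILTER = new SimpleBatchFilter("NO_FILTER") {
        @Override
        public boolean accept(SimpleRecord record) {
            return true;
        }
    };

    /** Accepts only response records (assumed here to carry the type "response"). */
    static final SimpleBatchFilter EXCLUDE_NON_RESPONSE_RECORDS =
            new SimpleBatchFilter("EXCLUDE_NON_RESPONSE_RECORDS") {
        @Override
        public boolean accept(SimpleRecord record) {
            return "response".equals(record.type);
        }
    };

    /** Accepts only records whose URL starts with "http" (this also matches "https"). */
    static final SimpleBatchFilter ONLY_HTTP_ENTRIES = new SimpleBatchFilter("ONLY_HTTP_ENTRIES") {
        @Override
        public boolean accept(SimpleRecord record) {
            return record.url != null && record.url.startsWith("http");
        }
    };

    public static void main(String[] args) {
        SimpleRecord metadata = new SimpleRecord("metadata", "metadata://example.org/info");
        System.out.println(NO_FILTER.getName() + ": " + NO_FILTER.accept(metadata));
        System.out.println(ONLY_HTTP_ENTRIES.getName() + ": " + ONLY_HTTP_ENTRIES.accept(metadata));
    }
}
```

A custom filter would follow the same pattern: extend the abstract class, pass a name to the protected constructor, and implement `accept`.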
Modifier | Constructor and Description |
---|---|
protected | WARCBatchFilter(String name): Create a new filter with the given name. |
Modifier and Type | Method and Description |
---|---|
abstract boolean | accept(org.archive.io.warc.WARCRecord record): Check if a given record is accepted (not filtered out) by this filter. |
static WARCBatchFilter | getMimetypeBatchFilter(String mimetype): Get a filter for the given mimetype. Note that the mimetype of the WARC response record is not (necessarily) the same as that of its payload. |
protected String | getName(): Get the name of the filter. |
static boolean | mimetypeIsOk(String mimetype): Check if a given mimetype is valid. |
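The relationship between `mimetypeIsOk` and `getMimetypeBatchFilter` (validate first, fail if the mimetype is invalid) might look roughly like the sketch below. The validity pattern, the `mimetypeFilter` helper, the case-insensitive comparison, and the use of `IllegalArgumentException` in place of `MimeTypeParseException` are all assumptions for illustration; the library's actual rules may be stricter or looser.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

/** Illustrative sketch only; not the library's implementation. */
class MimetypeCheckSketch {

    // Assumed rule: "type/subtype", each part made of word characters, '.', '+' or '-'.
    private static final Pattern MIMETYPE_PATTERN = Pattern.compile("[\\w.+-]+/[\\w.+-]+");

    /** Returns true if the string looks like a mimetype under the assumed rule. */
    static boolean mimetypeIsOk(String mimetype) {
        return mimetype != null && MIMETYPE_PATTERN.matcher(mimetype).matches();
    }

    /**
     * Hypothetical factory: validates the mimetype, then returns a predicate that
     * accepts record mimetypes equal to it (ignoring case).
     * IllegalArgumentException stands in for MimeTypeParseException here.
     */
    static Predicate<String> mimetypeFilter(String mimetype) {
        if (!mimetypeIsOk(mimetype)) {
            throw new IllegalArgumentException("Invalid mimetype: " + mimetype);
        }
        return recordMimetype -> mimetype.equalsIgnoreCase(recordMimetype);
    }

    public static void main(String[] args) {
        System.out.println(mimetypeIsOk("text/html"));
        System.out.println(mimetypeIsOk("texthtml"));
        System.out.println(mimetypeFilter("text/html").test("text/html"));
    }
}
```

Because the mimetype of a WARC response record is not necessarily that of its payload, a filter like this only narrows on whichever mimetype the caller chooses to pass in.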
public static final WARCBatchFilter NO_FILTER
public static final WARCBatchFilter EXCLUDE_NON_RESPONSE_RECORDS
public static final WARCBatchFilter ONLY_HTTP_ENTRIES
protected WARCBatchFilter(String name)
name
- The name of this filter, for debugging mostly.
public static WARCBatchFilter getMimetypeBatchFilter(String mimetype) throws MimeTypeParseException
mimetype
- String denoting the mimetype this filter represents
MimeTypeParseException
- If mimetype is invalid
public static boolean mimetypeIsOk(String mimetype)
mimetype
- a given mimetype
public abstract boolean accept(org.archive.io.warc.WARCRecord record)
record
- a given WARCRecord
Copyright © 2005–2015 The Royal Danish Library, the Danish State and University Library, the National Library of France and the Austrian National Library. All rights reserved.