dk.netarkivet.common.utils.batch
Class ArchiveBatchFilter

java.lang.Object
  extended by dk.netarkivet.common.utils.batch.ArchiveBatchFilter
All Implemented Interfaces:
java.io.Serializable

public abstract class ArchiveBatchFilter
extends java.lang.Object
implements java.io.Serializable

A filter class for batch entries. It allows a batch job to test whether an entry should be processed without loading the entry data first. accept() is given the archive record itself (an ArchiveRecordBase), so the data of records rejected by the filter need not be read or copied.
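
The idea of deciding on a record before its data is loaded can be sketched as follows. This is a minimal, self-contained illustration and not NetarchiveSuite code: LazyRecord, HeaderFilter and LazyFilterSketch are hypothetical stand-ins, and the real ArchiveRecordBase API is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for an archive record: a cheap header field plus a lazily loaded payload. */
final class LazyRecord {
    final String url;
    private final Supplier<byte[]> payloadLoader;
    private boolean loaded = false;

    LazyRecord(String url, Supplier<byte[]> payloadLoader) {
        this.url = url;
        this.payloadLoader = payloadLoader;
    }

    /** Reads the payload; this is the expensive step the filter is meant to avoid. */
    byte[] loadPayload() {
        loaded = true;
        return payloadLoader.get();
    }

    boolean isLoaded() {
        return loaded;
    }
}

/** Hypothetical stand-in for a filter that is consulted before any payload is read. */
interface HeaderFilter {
    boolean accept(LazyRecord record);
}

final class LazyFilterSketch {
    /** Returns the payloads of accepted records only; rejected records are never loaded. */
    static List<byte[]> process(List<LazyRecord> records, HeaderFilter filter) {
        List<byte[]> result = new ArrayList<>();
        for (LazyRecord record : records) {
            if (filter.accept(record)) {
                result.add(record.loadPayload());
            }
        }
        return result;
    }
}
```

In this sketch the filter only looks at the url field, so the payload supplier of a rejected record is never invoked.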

See Also:
Serialized Form

Field Summary
static ArchiveBatchFilter EXCLUDE_NON_RESPONSE_RECORDS
          A default filter: Accepts only response records.
static ArchiveBatchFilter EXCLUDE_NON_WARCINFO_RECORDS
          A default filter: Accepts only warcinfo records.
protected  java.lang.String name
          The name of the BatchFilter.
static ArchiveBatchFilter NO_FILTER
          A default filter: Accepts everything.
static ArchiveBatchFilter ONLY_HTTP_ENTRIES
          Filter that only accepts records where the URL starts with http.
 
Constructor Summary
protected ArchiveBatchFilter(java.lang.String name)
          Create a new filter with the given name.
 
Method Summary
abstract  boolean accept(ArchiveRecordBase record)
          Check if a given record is accepted (not filtered out) by this filter.
static ArchiveBatchFilter getMimetypeBatchFilter(java.lang.String mimetype)
          Get a filter that accepts only records with the given mimetype.
protected  java.lang.String getName()
          Get the name of the filter.
static boolean mimetypeIsOk(java.lang.String mimetype)
          Check whether a given mimetype is valid.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

name

protected java.lang.String name
The name of the BatchFilter.


NO_FILTER

public static final ArchiveBatchFilter NO_FILTER
A default filter: Accepts everything.


EXCLUDE_NON_RESPONSE_RECORDS

public static final ArchiveBatchFilter EXCLUDE_NON_RESPONSE_RECORDS
A default filter: Accepts only response records.


EXCLUDE_NON_WARCINFO_RECORDS

public static final ArchiveBatchFilter EXCLUDE_NON_WARCINFO_RECORDS
A default filter: Accepts only warcinfo records.


ONLY_HTTP_ENTRIES

public static final ArchiveBatchFilter ONLY_HTTP_ENTRIES
Filter that only accepts records where the URL starts with http.

Constructor Detail

ArchiveBatchFilter

protected ArchiveBatchFilter(java.lang.String name)
Create a new filter with the given name.

Parameters:
name - The name of this filter, used mostly for debugging.
Method Detail

getName

protected java.lang.String getName()
Get the name of the filter.

Returns:
the name of the filter.

accept

public abstract boolean accept(ArchiveRecordBase record)
Check if a given record is accepted (not filtered out) by this filter.

Parameters:
record - a given archive record
Returns:
true, if the given archive record is accepted by this filter
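
A concrete filter supplies its own accept() logic. The sketch below mirrors only the shape documented on this page (a name passed to a protected constructor, getName(), and an abstract accept()). FilterStandIn, RecordStandIn and UrlSuffixFilter are hypothetical names, and getUrl() is an assumed accessor rather than a documented method of ArchiveRecordBase.

```java
import java.io.Serializable;

/** Hypothetical stand-in for ArchiveRecordBase, exposing only a URL (assumed accessor). */
interface RecordStandIn {
    String getUrl();
}

/** Stand-in with the shape documented above: a name plus an abstract accept(). */
abstract class FilterStandIn implements Serializable {
    private static final long serialVersionUID = 1L;

    /** The name of the filter, used mostly for debugging. */
    protected String name;

    protected FilterStandIn(String name) {
        this.name = name;
    }

    protected String getName() {
        return name;
    }

    /** Returns true if the record is accepted (not filtered out). */
    public abstract boolean accept(RecordStandIn record);
}

/** Example subclass: accepts only records whose URL ends with a given suffix. */
final class UrlSuffixFilter extends FilterStandIn {
    private static final long serialVersionUID = 1L;
    private final String suffix;

    UrlSuffixFilter(String suffix) {
        super("UrlSuffixFilter(" + suffix + ")");
        this.suffix = suffix;
    }

    @Override
    public boolean accept(RecordStandIn record) {
        String url = record.getUrl();
        // Records without a URL are rejected rather than causing an exception.
        return url != null && url.endsWith(suffix);
    }
}
```

Keeping the filter's state to simple serializable fields (here a single String) fits the Serializable contract of the class.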

getMimetypeBatchFilter

public static ArchiveBatchFilter getMimetypeBatchFilter(java.lang.String mimetype)
                                                 throws java.awt.datatransfer.MimeTypeParseException
Get a filter that accepts only records with the given mimetype. Note that the mimetype of a WARC response record is not necessarily the same as the mimetype of its payload.

Parameters:
mimetype - String denoting the mimetype this filter represents
Returns:
a BatchFilter that filters out all archive records that do not have this mimetype
Throws:
java.awt.datatransfer.MimeTypeParseException - if mimetype is invalid
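
The validate-then-build behaviour described above might look roughly like the following. This is a sketch under stated assumptions, not the library's implementation: MimeRecord, MimeFilter and MimeFilterFactorySketch are hypothetical stand-ins, the validity pattern is assumed to be word/word, and matching by prefix is an assumption chosen so that parameters such as a charset do not prevent a match.

```java
import java.awt.datatransfer.MimeTypeParseException;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a record that exposes its mimetype (assumed accessor). */
interface MimeRecord {
    String getMimetype();
}

/** Hypothetical stand-in for a batch filter. */
interface MimeFilter {
    boolean accept(MimeRecord record);
}

final class MimeFilterFactorySketch {
    // Assumed validity pattern ("word/word"); the library's actual pattern may differ.
    private static final Pattern VALID = Pattern.compile("\\w+/\\w+");

    /** Builds a filter for the given mimetype, or throws if the mimetype is not valid. */
    static MimeFilter forMimetype(final String mimetype) throws MimeTypeParseException {
        if (mimetype == null || !VALID.matcher(mimetype).matches()) {
            throw new MimeTypeParseException("Invalid mimetype: " + mimetype);
        }
        return record -> {
            String actual = record.getMimetype();
            // Assumption: prefix match, so "text/html; charset=UTF-8" counts as text/html.
            return actual != null && actual.startsWith(mimetype);
        };
    }
}
```

Validating once in the factory means each accept() call only performs a cheap string comparison.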

mimetypeIsOk

public static boolean mimetypeIsOk(java.lang.String mimetype)
Check whether a given mimetype is valid.

Parameters:
mimetype - the mimetype string to check
Returns:
true if mimetype matches the pattern word/word, otherwise false
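
A word/word check of this kind can be expressed with a regular expression. The sketch below assumes that "word" means one or more regex word characters (\w); the pattern actually used by mimetypeIsOk is not shown on this page, and MimetypeCheckSketch and looksLikeMimetype are hypothetical names.

```java
import java.util.regex.Pattern;

final class MimetypeCheckSketch {
    // Assumption: "word/word" means \w+ on both sides of a single slash.
    private static final Pattern WORD_SLASH_WORD = Pattern.compile("\\w+/\\w+");

    /** Returns true only if the whole string has the form word/word. */
    static boolean looksLikeMimetype(String mimetype) {
        return mimetype != null && WORD_SLASH_WORD.matcher(mimetype).matches();
    }
}
```

Because matches() requires the whole string to fit the pattern, a value carrying parameters, such as "text/html; charset=UTF-8", is rejected by this sketch.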