dk.netarkivet.common.utils
Class WARCUtils

java.lang.Object
  extended by dk.netarkivet.common.utils.WARCUtils

public class WARCUtils
extends java.lang.Object

Various utilities for working with WARC records. Some of the code is borrowed from Wayback.

See Also:
org.archive.wayback.resourcestore.indexer.WARCRecordToSearchResultAdapter.java

Field Summary
protected static org.apache.commons.logging.Log log
          The logger for this class.
 
Constructor Summary
WARCUtils()
           
 
Method Summary
static java.lang.String getRecordType(org.archive.io.warc.WARCRecord record)
          Returns the type of the given WARC record.
static byte[] readWARCRecord(org.archive.io.warc.WARCRecord record)
          Read the contents (payload) of a WARC record into a byte array.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

log

protected static final org.apache.commons.logging.Log log
The logger for this class.

Constructor Detail

WARCUtils

public WARCUtils()

Method Detail

readWARCRecord

public static byte[] readWARCRecord(org.archive.io.warc.WARCRecord record)
                             throws IOFailure
Read the contents (payload) of a WARC record into a byte array.

Parameters:
record - A WARC record to read from. After reading, the record's own data is no longer available for reading.
Returns:
A byte array containing the payload of the WARC record. The payload size is computed as the record length minus the contentBegin value; both values are taken from the record header.
Throws:
IOFailure - If there is an error reading the data, or if the record is longer than Integer.MAX_VALUE bytes (a Java array cannot be larger).
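
The size calculation and the Integer.MAX_VALUE limit described above can be illustrated with a minimal, self-contained sketch. This is not the WARCUtils implementation: the class PayloadReadSketch and the method readPayload are hypothetical names, a plain InputStream stands in for the WARCRecord, the two long parameters stand in for the header values, and UncheckedIOException stands in for IOFailure.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class PayloadReadSketch {

    /**
     * Hypothetical helper: reads (recordLength - contentBegin) bytes from a
     * stream that is assumed to be positioned at the start of the payload.
     */
    static byte[] readPayload(InputStream in, long recordLength, long contentBegin) {
        // Payload size = record length minus the offset where the content begins.
        long payloadLength = recordLength - contentBegin;
        if (payloadLength < 0 || payloadLength > Integer.MAX_VALUE) {
            // A Java array is indexed by int, so a larger payload cannot be held.
            throw new UncheckedIOException(new IOException(
                    "Payload length " + payloadLength + " does not fit in a byte array"));
        }
        byte[] payload = new byte[(int) payloadLength];
        int offset = 0;
        try {
            // read() may return fewer bytes than requested, so loop until full.
            while (offset < payload.length) {
                int n = in.read(payload, offset, payload.length - offset);
                if (n < 0) {
                    throw new IOException("Stream ended after " + offset
                            + " of " + payload.length + " bytes");
                }
                offset += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return payload;
    }
}
```

With the real class, the equivalent call would be `byte[] payload = WARCUtils.readWARCRecord(record);`. As noted under Parameters, the record's data cannot be read again afterwards.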

getRecordType

public static java.lang.String getRecordType(org.archive.io.warc.WARCRecord record)
Returns the type of the given WARC record.

Parameters:
record - The WARCRecord to inspect.
Returns:
The type of the WARC record, as a String.
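
In the WARC format, a record's type is carried in the WARC-Type named field of the record header, with values such as warcinfo, response, resource, request, metadata and revisit. The sketch below assumes, without confirmation from this page, that getRecordType returns that header value. A plain Map stands in for the record header, and RecordTypeSketch and recordType are hypothetical names.

```java
import java.util.Map;

public class RecordTypeSketch {

    /**
     * Hypothetical stand-in for a header lookup: returns the value of the
     * WARC-Type field, or null if the header does not contain one.
     */
    static String recordType(Map<String, Object> headerFields) {
        Object type = headerFields.get("WARC-Type");
        return type == null ? null : type.toString();
    }
}
```

A caller could use the returned string to process only some records, for example only those whose type equals "response".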