dk.netarkivet.common.utils.cdx
Class CDXUtils
java.lang.Object
dk.netarkivet.common.utils.cdx.CDXUtils
public class CDXUtils
- extends java.lang.Object
Utility class for creating CDX-files.
The CDX-format is described here:
http://www.archive.org/web/researcher/cdx_file_format.php
Method Summary
static void
generateCDX(ArchiveProfile archiveProfile,
java.io.File archiveFileDirectory,
java.io.File cdxFileDirectory)
Applies createCDXRecord() to all ARC/WARC files in a directory, creating
one CDX file per ARC/WARC file.
static void
writeCDXInfo(java.io.File archivefile,
java.io.OutputStream cdxstream)
Add cdx info for a given archive file to a given OutputStream.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
CDXUtils
public CDXUtils()
writeCDXInfo
public static void writeCDXInfo(java.io.File archivefile,
java.io.OutputStream cdxstream)
- Add cdx info for a given archive file to a given OutputStream.
Note that any exceptions are logged at level FINE but otherwise ignored.
- Parameters:
archivefile - A file with archive records
cdxstream - An output stream to add CDX lines to
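Because failures are logged at level FINE and otherwise ignored, a caller gets no exception when indexing fails. The following is a minimal, self-contained sketch of that log-and-ignore pattern and of one way a caller might notice that nothing was written. It does not call the real library: writeInfoStandIn is a hypothetical stand-in for writeCDXInfo, and the class name, the input format, and the simulated failure are all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.logging.Level;
import java.util.logging.Logger;

public class SilentWriteSketch {
    private static final Logger LOG =
            Logger.getLogger(SilentWriteSketch.class.getName());

    /**
     * Hypothetical stand-in for writeCDXInfo (not the real method).
     * Writes one line per entry; any failure is logged at FINE and
     * the method returns normally.
     */
    static void writeInfoStandIn(String[] lines, OutputStream out) {
        try {
            for (String line : lines) {
                if (line == null) {
                    // Simulated failure, standing in for an unreadable record.
                    throw new IOException("unreadable record");
                }
                out.write((line + "\n").getBytes(StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            // Logged at FINE but otherwise ignored, as the note above describes.
            LOG.log(Level.FINE, "Ignoring failure while writing index lines", e);
        }
    }

    /** Buffers the output so the caller can see how many bytes were written. */
    static int bytesWritten(String[] lines) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writeInfoStandIn(lines, buffer);
        return buffer.size();
    }

    public static void main(String[] args) {
        System.out.println(bytesWritten(new String[] {"a", "b"}));
        System.out.println(bytesWritten(new String[] {null}));
    }
}
```

Writing into a buffer first and checking its size is only one possible design choice. Raising the logger to level FINE is another way to make such failures visible.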
generateCDX
public static void generateCDX(ArchiveProfile archiveProfile,
java.io.File archiveFileDirectory,
java.io.File cdxFileDirectory)
throws ArgumentNotValid
- Applies createCDXRecord() to all ARC/WARC files in a directory, creating
one CDX file per ARC/WARC file.
Note that any exceptions during index generation are logged at level FINE
but otherwise ignored.
Exceptions while creating a CDX file are logged at level WARNING but
otherwise ignored.
CDX files are named after the ARC/WARC files, except that ".(w)arc" or
".(w)arc.gz" is extended with ".cdx".
- Parameters:
archiveProfile - archive profile including filters, patterns, etc.
archiveFileDirectory - A directory with archive files to generate an index for
cdxFileDirectory - A directory to generate CDX files in
- Throws:
ArgumentNotValid - if either directory is null or is not an
existing directory, or if cdxFileDirectory is not writable.
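The conditions listed under Throws can be checked up front before calling generateCDX. The sketch below is a self-contained illustration that uses only java.io.File. The class DirectoryCheckSketch and the method firstProblem are hypothetical names, and the method returns a message instead of throwing ArgumentNotValid so that the example does not depend on the library.

```java
import java.io.File;

public class DirectoryCheckSketch {
    /**
     * Hypothetical helper (not part of CDXUtils). Returns null when both
     * directories satisfy the documented conditions, otherwise a short
     * description of the first problem found.
     */
    static String firstProblem(File archiveFileDirectory, File cdxFileDirectory) {
        if (archiveFileDirectory == null) {
            return "archiveFileDirectory is null";
        }
        if (cdxFileDirectory == null) {
            return "cdxFileDirectory is null";
        }
        if (!archiveFileDirectory.isDirectory()) {
            return "archiveFileDirectory is not an existing directory";
        }
        if (!cdxFileDirectory.isDirectory()) {
            return "cdxFileDirectory is not an existing directory";
        }
        if (!cdxFileDirectory.canWrite()) {
            return "cdxFileDirectory is not writable";
        }
        return null;
    }

    public static void main(String[] args) {
        // The system temp directory is used here only as a convenient example.
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        System.out.println(firstProblem(tmp, tmp));
        System.out.println(firstProblem(null, tmp));
    }
}
```

A caller could log or report the returned message and skip the generateCDX call when it is not null.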