Class WARCExtractCDX


  • public class WARCExtractCDX
    extends Object
    Command line tool for extracting CDX information from given WARC files.

    Usage: java dk.netarkivet.common.tools.WARCExtractCDX file1.ext [file2.ext ...] > myindex.cdx

    "ext" can be warc or warc.gz

    Note: This tool does not depend on logging; it reports failures on stderr.

    • Constructor Detail

      • WARCExtractCDX

        public WARCExtractCDX()
    • Method Detail

      • main

        public static void main(String[] argv)
        Main method. Extracts CDX from all given files and outputs the index on stdout.
        Parameters:
        argv - A list of (absolute paths to) files to index.
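
The argument conventions described above (absolute paths, names ending in warc or warc.gz) can be checked before the tool is invoked. The following is a minimal sketch, not part of NetarchiveSuite: the class WarcArgs and its methods are hypothetical helpers, and the case-insensitive extension match is an assumption. Like the tool itself, it reports rejected inputs on stderr.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not part of NetarchiveSuite): prepares an argv array
// suitable for passing to WARCExtractCDX.main.
class WarcArgs {

    // Returns true if the name ends in ".warc" or ".warc.gz".
    // Ignoring case is an assumption; the documentation only names the two extensions.
    static boolean hasWarcExtension(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        return lower.endsWith(".warc") || lower.endsWith(".warc.gz");
    }

    // Keeps only WARC-looking paths and converts them to absolute paths.
    // Rejected entries are reported on stderr rather than through a logger.
    static String[] toArgv(List<String> paths) {
        List<String> argv = new ArrayList<>();
        for (String path : paths) {
            if (hasWarcExtension(path)) {
                argv.add(new File(path).getAbsolutePath());
            } else {
                System.err.println("Skipping file without warc/warc.gz extension: " + path);
            }
        }
        return argv.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] argv = toArgv(Arrays.asList(args));
        for (String arg : argv) {
            System.out.println(arg);
        }
        // With NetarchiveSuite on the classpath, the array could then be handed on:
        // dk.netarkivet.common.tools.WARCExtractCDX.main(argv);
    }
}
```

Because the index is written to stdout, a caller would still redirect the output to a file, as in the usage line of the class description.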