
warc package


This class collects most of the constants, the majority of which are primarily for internal use.

ReaderFactory and Readers

  • This factory can be used to create the various types of readers with optional buffering. You can get either compressed or uncompressed readers. There are also methods that auto-detect whether a compressed reader is required.
  • Abstract reader class which is the base for all the readers. It also defines the options that can be set on a reader; currently these are only digest options.
  • A reader implementation for reading compressed records.
  • A reader implementation for reading uncompressed records.
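
The auto-detection mentioned above can be illustrated with a minimal sketch. It assumes detection is based on the two-byte GZIP magic number (0x1f 0x8b), which is a common approach but is not confirmed here as the one used by the factory. The class `GzipSniffer` and its methods are hypothetical names for illustration only, not part of the JWAT API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a JWAT class: peeks at a stream to decide
// whether a compressed or an uncompressed reader would be needed.
final class GzipSniffer {
    /** Length of the GZIP magic number (0x1f 0x8b). */
    private static final int MAGIC_LENGTH = 2;

    /** Wraps a stream so that the peeked bytes can be pushed back. */
    static PushbackInputStream wrap(InputStream in) {
        return new PushbackInputStream(in, MAGIC_LENGTH);
    }

    /**
     * Reads up to two bytes, pushes them back, and reports whether they
     * match the GZIP magic number. The stream position is left unchanged.
     */
    static boolean isGzipped(PushbackInputStream in) throws IOException {
        byte[] magic = new byte[MAGIC_LENGTH];
        int read = 0;
        while (read < MAGIC_LENGTH) {
            int n = in.read(magic, read, MAGIC_LENGTH - read);
            if (n == -1) {
                break; // stream shorter than the magic number
            }
            read += n;
        }
        if (read > 0) {
            in.unread(magic, 0, read);
        }
        return read == MAGIC_LENGTH
                && (magic[0] & 0xff) == 0x1f
                && (magic[1] & 0xff) == 0x8b;
    }

    public static void main(String[] args) throws IOException {
        byte[] plain = "WARC/1.0\r\n".getBytes(StandardCharsets.US_ASCII);
        PushbackInputStream in = wrap(new ByteArrayInputStream(plain));
        System.out.println(isGzipped(in) ? "compressed" : "uncompressed");
    }
}
```

Because the peeked bytes are pushed back, the same stream can afterwards be handed to whichever reader is selected without losing the start of the first record.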

This class contains the record parser, fields and validation.

Auxiliary classes

  • When a WARC header is read, each line is encapsulated in an instance of this class.
  • Parses and validates a WARC date.
  • Parses, validates and encapsulates a WARC digest header (algorithm, digest, encoding). The encoding is auto-detected and added later in the reading process.
  • Defines the different possible error types.
  • Defines a WARC validation error using a type, key and value.
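
To make the date and digest handling described above more concrete, here is a rough, self-contained sketch. It assumes the WARC 1.0 date form `YYYY-MM-DDThh:mm:ssZ` and a digest header of the form `algorithm:value`, and it guesses the encoding purely from the value's length relative to the digest size. The class `WarcFieldSketch` and all of its methods are hypothetical; they are not the JWAT classes and may differ from how the library performs these steps.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

// Hypothetical illustration, not part of the JWAT API.
final class WarcFieldSketch {
    // Assumed WARC 1.0 date layout; STRICT rejects dates such as February 30.
    private static final DateTimeFormatter WARC_DATE = DateTimeFormatter
            .ofPattern("uuuu-MM-dd'T'HH:mm:ss'Z'")
            .withResolverStyle(ResolverStyle.STRICT);

    /** Returns the parsed instant, or null if the value is not a valid date. */
    static Instant parseDate(String value) {
        if (value == null) {
            return null;
        }
        try {
            return LocalDateTime.parse(value, WARC_DATE).toInstant(ZoneOffset.UTC);
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    /** Splits "algorithm:value" into its two parts, or returns null if malformed. */
    static String[] splitDigest(String header) {
        if (header == null) {
            return null;
        }
        int colon = header.indexOf(':');
        if (colon <= 0 || colon == header.length() - 1) {
            return null;
        }
        return new String[] {header.substring(0, colon), header.substring(colon + 1)};
    }

    /**
     * Guesses the encoding from the encoded length alone, given the raw
     * digest size in bytes (for example 20 for SHA-1). Returns null if
     * no known encoding matches.
     */
    static String guessEncoding(String value, int digestBytes) {
        int length = value.length();
        if (length == digestBytes * 2) {
            return "base16";
        }
        if (length == ((digestBytes + 4) / 5) * 8) {
            return "base32";
        }
        if (length == ((digestBytes + 2) / 3) * 4) {
            return "base64";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(parseDate("2006-09-19T17:20:14Z"));
        String[] parts = splitDigest("sha1:AB2CD3EF4GH5IJ6KL7MN8OPQ");
        System.out.println(parts[0] + " / " + parts[1]);
    }
}
```

The length-based guess only works once the digest size is known, which is one plausible reason for determining the encoding later in the reading process rather than at header-parse time.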


  • Abstract writer class which is the base for all the writers.
  • A writer implementation prototype.