Welcome to the Java Web Archive Toolkit

This wiki describes the packages as a whole and includes some detail on how the main classes are implemented.

The JWAT code was originally written for use only in a number of JHOVE2 modules, but because the classes are also useful outside the JHOVE2 project, it was split out into an independent project.

Note: Although the repository is about 80 MB, most of that is test data. The libraries themselves are very small.


  • Classes to read and validate Arc, GZip and Warc files.
  • Support for reading and validating GZip compressed Arc and Warc files.
  • (Multi-file) GZip validating decompressor/compressor.
  • GZip Input/Output streams.
  • ISO-8859-1/UTF-8 validating decoder/encoder.
  • QuotedString validating decoder.
  • EncodedWords validating decoder.
  • Common classes for Base64, Base32 and Base16.
  • Various special purpose stream implementations.
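
To illustrate what "multi-file" GZip means in the list above, here is a minimal sketch that uses only the JDK's java.util.zip classes, not the JWAT API. It writes two independent GZip members back to back, the layout typically used for compressed Arc and Warc files, and reads them back as one stream. The class and method names (GzipMembersSketch, writeMembers, readAll) are hypothetical and chosen for this example only. The JDK classes perform no validation beyond the standard CRC and length checks.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical example class; it is not part of JWAT.
public class GzipMembersSketch {

    /** Compresses each payload as its own GZip member and concatenates the members. */
    public static byte[] writeMembers(String... payloads) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String payload : payloads) {
            // Closing the GZIPOutputStream finishes the member (trailer included).
            // ByteArrayOutputStream.close() is a no-op, so "out" stays usable.
            try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
                gz.write(payload.getBytes(StandardCharsets.UTF_8));
            }
        }
        return out.toByteArray();
    }

    /** Decompresses all concatenated members into a single string. */
    public static String readAll(byte[] compressed) throws IOException {
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            // GZIPInputStream (Java 7 and later) continues into the next member
            // when more data follows a member's trailer.
            while ((n = gz.read(buffer)) != -1) {
                plain.write(buffer, 0, n);
            }
        }
        return new String(plain.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = writeMembers("first record\n", "second record\n");
        System.out.print(readAll(data));
    }
}
```

The standard stream hides member boundaries from the caller, so a tool that needs to inspect or validate each member separately needs a reader that exposes them.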

Mercurial source repository
Issue tracker
Continuous integration
Browse source code
Code analysis
Maven site: TBD.