Java Web Archive Toolkit

This toolkit includes classes for reading and validating Arc, GZip and Warc files. GZip-compressed Arc and Warc files are also supported.

The toolkit has the following package layout:

  • jwat-common: General-purpose classes, including specialized streams, binary-to-string encoding and common Arc/Warc HTTP response and payload code.
  • jwat-gzip: GZip input-stream/entry reader/validator.
  • jwat-arc: Contains Arc reader/validator specific classes.
  • jwat-warc: Contains Warc reader/validator specific classes.
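
Because Arc and Warc files may or may not be GZip compressed, a reader typically has to check the start of the stream before deciding how to decode it. The sketch below illustrates that idea using only JDK classes. It does not use or describe the toolkit's own API: the class name GzipSniff and its methods are hypothetical, and the sample record bytes are made up for the demonstration.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of the toolkit: JDK-only GZip detection.
public class GzipSniff {

    /** Returns true if the stream starts with the GZip magic bytes 0x1f 0x8b. */
    public static boolean isGzipped(BufferedInputStream in) throws IOException {
        in.mark(2);            // remember the position so the peek can be undone
        int b1 = in.read();
        int b2 = in.read();
        in.reset();            // rewind: the caller still sees the full stream
        return b1 == 0x1f && b2 == 0x8b;
    }

    /** Wraps the stream in a GZIPInputStream only when it looks compressed. */
    public static InputStream openMaybeGzipped(InputStream raw) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);
        return isGzipped(in) ? new GZIPInputStream(in) : in;
    }

    /** Reads bytes up to the first line feed and returns them without CR/LF. */
    public static String firstLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') {
                line.write(b);
            }
        }
        return line.toString("US-ASCII");
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data resembling the first lines of a Warc record.
        byte[] plain = "WARC/1.0\r\nWARC-Type: warcinfo\r\n"
                .getBytes(StandardCharsets.US_ASCII);

        // Compress the same bytes in memory to simulate a GZip-compressed file.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(plain);
        }
        byte[] compressed = buf.toByteArray();

        // Both inputs yield the same first line once opened through the helper.
        System.out.println(firstLine(openMaybeGzipped(new ByteArrayInputStream(plain))));
        System.out.println(firstLine(openMaybeGzipped(new ByteArrayInputStream(compressed))));
    }
}
```

Running main prints WARC/1.0 twice, once for the plain input and once for the compressed one. Peeking with mark/reset keeps the check non-destructive, so the same stream can be handed to whichever decoder is chosen.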

It is written in Java, it is beautiful, and it is not yet complete.

