Added callback support to read and modify the ARC/WARC headers before processing the payload.
    • -0
    • +24
    ./arc/ArcRecordParserCallback.java
  1. … 5 more files in changeset.
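The callback idea in the changeset above can be sketched roughly as follows. This is only an illustration under stated assumptions: the real signature of ArcRecordParserCallback is not shown in this log, so HeaderCallbackSketch, ParserSketch and parseHeader are hypothetical names, and the header is modelled as a plain map of name/value pairs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical callback; NOT the actual ArcRecordParserCallback interface.
interface HeaderCallbackSketch {
    // Invoked once the header has been parsed and before any payload handling.
    void headerParsed(Map<String, String> header);
}

// Hypothetical parser stand-in used only to show where the callback fires.
class ParserSketch {
    // Parses "Name: value" lines, lets the callback read or modify the result,
    // and returns the header as it stands before payload processing would start.
    static Map<String, String> parseHeader(String[] lines, HeaderCallbackSketch callback) {
        Map<String, String> header = new LinkedHashMap<>();
        for (String line : lines) {
            int idx = line.indexOf(':');
            if (idx > 0) {
                header.put(line.substring(0, idx).trim(), line.substring(idx + 1).trim());
            }
        }
        if (callback != null) {
            callback.headerParsed(header);
        }
        return header;
    }
}
```

Because the callback receives the mutable header map, a caller can rewrite a field before the payload is touched, for example by passing a lambda that replaces one value.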
CR-JWAS-33: Follow-up on review.
  1. … 61 more files in changeset.
JWAT-87: Improved detection of garbage at the end of (W)ARC files, with unit tests for this. Also added unit tests for empty (W)ARC files.
  1. … 5 more files in changeset.
Removed some leftover dependencies on ArcHeader in the parser.
Made some changes so ArcHeader, WarcHeader and HttpHeader can be re-parsed without using the complete Arc/Warc Readers.
  1. … 3 more files in changeset.
Made some methods public instead of protected. Various cleanup.
  1. … 2 more files in changeset.
ArcReader and WarcReader now implement Iterable<..> interface.
  1. … 3 more files in changeset.
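What implementing Iterable buys the caller can be sketched as below. This is a minimal illustration, not the JWAT API: SketchReader is a hypothetical stand-in that iterates over strings, whereas the element type of the real ArcReader and WarcReader is elided in the commit message.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a record reader; NOT the JWAT ArcReader/WarcReader.
class SketchReader implements Iterable<String> {
    private final List<String> records;

    SketchReader(List<String> records) {
        this.records = records;
    }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < records.size();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return records.get(next++);
            }
        };
    }

    // Counts records with the enhanced for loop that Iterable makes possible.
    static int count(SketchReader reader) {
        int n = 0;
        for (String record : reader) {
            n++;
        }
        return n;
    }
}
```

The design benefit is that callers no longer need an explicit "get next record until null" loop; any code that accepts an Iterable can consume the reader directly.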
JWAT-77: Unit tests and bug fixes for newly implemented ArcFileWriter/WarcFileWriter and related classes.

JWAT-76: Fix for archiveLengthStr/contentLengthStr set and archiveLength/contentLength null when using payload length validation.

Removed a lot of tabs and replaced them with spaces. (Company policy)

Minor code cleanup.

    • -15
    • +35
    ./arc/ArcFileNamingSingleFile.java
  1. … 91 more files in changeset.
JWAT-77: Add (W)ArcFileWriter helper classes.
    • -0
    • +26
    ./arc/ArcFileNaming.java
    • -0
    • +82
    ./arc/ArcFileNamingDefault.java
    • -0
    • +44
    ./arc/ArcFileNamingSingleFile.java
    • -0
    • +138
    ./arc/ArcFileWriter.java
    • -0
    • +52
    ./arc/ArcFileWriterConfig.java
  1. … 5 more files in changeset.
Fixed some texts. Added some spaces.
  1. … 39 more files in changeset.
Classes with a close() method now implement Closeable.
  1. … 8 more files in changeset.
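One practical effect of implementing java.io.Closeable is that such classes can be used in try-with-resources. A minimal sketch, assuming a hypothetical SketchResource class (not a JWAT class):

```java
import java.io.Closeable;

// Hypothetical resource holder; NOT a JWAT class.
class SketchResource implements Closeable {
    private boolean closed = false;

    boolean isClosed() {
        return closed;
    }

    // Closeable.close() may be overridden without the checked IOException.
    @Override
    public void close() {
        closed = true;
    }

    // Uses the resource inside try-with-resources and returns it afterwards,
    // so the caller can observe that close() was invoked automatically.
    static SketchResource useAndRelease() {
        SketchResource handle = new SketchResource();
        try (SketchResource r = handle) {
            if (r.isClosed()) {
                throw new IllegalStateException("resource closed too early");
            }
        }
        return handle;
    }
}
```

The resource is closed when the try block exits, whether normally or through an exception, which removes the need for a hand-written finally block.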
Added some unit tests.
  1. … 6 more files in changeset.
JWAT-67: Validate support for sha256 Digest algorithm in WARC-header. Tested with sha1, sha-256, sha-512, tiger, RipeMD128, RipeMD160, RipeMD256, RipeMD320 and BouncyCastle.

Added reader and addHeader support for WARC-Refers-To-Target-URI and WARC-Refers-To-Date including minor tests.

  1. … 9 more files in changeset.
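Whether a given digest algorithm name is usable depends on the security providers installed in the running JVM. The sketch below shows one way to probe this with the standard java.security.MessageDigest API; DigestSupportSketch and its methods are hypothetical helpers and do not reflect how JWAT itself performs the check. Algorithms such as Tiger or the RIPEMD family may need an additional provider such as BouncyCastle.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; NOT part of JWAT.
class DigestSupportSketch {
    // Returns true if some installed provider offers the named algorithm.
    static boolean isSupported(String algorithm) {
        try {
            MessageDigest.getInstance(algorithm);
            return true;
        } catch (NoSuchAlgorithmException e) {
            return false;
        }
    }

    // Returns the digest length in bytes, or -1 if the algorithm is unavailable.
    static int digestLength(String algorithm) {
        try {
            return MessageDigest.getInstance(algorithm).getDigestLength();
        } catch (NoSuchAlgorithmException e) {
            return -1;
        }
    }
}
```

Knowing the expected digest length up front makes it possible to reject a header value whose decoded length does not match the declared algorithm.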
Work in progress on unified ARC/WARC reader. Module not included yet.
  1. … 6 more files in changeset.
Fix for JWAT-65 and JWAT-66.

JWAT-65: HttpHeader digest not calculated when using getInputStream on payload.

JWAT-66: ArcRecords with invalid Urls do not get their HttpHeader parsed.

  1. … 7 more files in changeset.
Follow-up from reviews.

Saving of test data for use in JHOVE2.

'no-type' is ignored when looking for HTTP headers.

Improved detection of possible ARC records.

Minor tweaks.

  1. … 37 more files in changeset.
Zero length ARC, WARC and GZip files are now reported as non compliant.
  1. … 15 more files in changeset.
Generation of ARC/WARC test files based on unit tests; reorganization of test files. Minor tweaks.
  1. … 68 more files in changeset.
Raw ARC record line now stored.

Removed some tabs.

  1. … 5 more files in changeset.
Strict validation of <> encapsulating some URIs.

Tying up loose ends.

  1. … 15 more files in changeset.
Added some getters.
  1. … 3 more files in changeset.
Minor refactoring of API, unit tests, etc.
  1. … 9 more files in changeset.
Added an even more relaxed Uri profile for Heritrix written data.

Warc-Profile is now treated as a URI; oversight fixed (JWAT-61).

Minor review stuff.

Refactored Test classes file names.

  1. … 56 more files in changeset.
Uri methods added with profile parameter, additional Uri profiles added, minor unit testing, javadocs and review changes.
  1. … 4 more files in changeset.
Follow-up to review CR-JWAS-25. Experimental Uri implementation.
  1. … 23 more files in changeset.
Partial conclusion of the following issues:

JWAT-46: ARC reader refactoring

JWAT-8: Unit tests and coverage of ARCRecordBase, ArcRecord and ArcVersionBlock

JWAT-45: ARC writer

Partial lenient Uri implementation.

  1. … 12 more files in changeset.
ArcWriter->ArcReader combo unit tested.
  1. … 11 more files in changeset.
Added the last validation checks in the Arc reader and added some unit tests for the new validation errors.
  1. … 5 more files in changeset.
startOffset tweaking and unit testing.

Unit testing of the hard-to-throw exceptions.

  1. … 18 more files in changeset.
Minor javadoc and complete unit testing of ArcRecord and ArcVersionBlock.
  1. … 7 more files in changeset.