CR-JWAS-33: Follow-up on review.
  1. … 69 more files in changeset.
JWAT-87: Improved detection of garbage at the end of (W)ARC files, with unit tests for this. Also added unit tests for empty (W)ARC files.
  1. … 5 more files in changeset.
JWAT-77: Unit tests and bug fixes for newly implemented ArcFileWriter/WarcFileWriter and related classes.

JWAT-76: Fix for archiveLengthStr/contentLengthStr set and archiveLength/contentLength null when using payload length validation.

Removed a lot of tabs and replaced them with spaces. (Company policy)

Minor code cleanup.

    • -0
    • +110
    ./TestArcFileNamingDefault.java
    • -0
    • +55
    ./TestArcFileNamingSingleFile.java
    • -0
    • +505
    ./TestArcFileWriter.java
    • -0
    • +64
    ./TestArcFileWriterConfig.java
    • -25
    • +11
    ./TestArcReaderUncompressed.java
    • -2
    • +2
    ./TestArcReader_NextAndIterRecord.java
  1. … 82 more files in changeset.
Fixed some texts. Added some spaces.
  1. … 39 more files in changeset.
JWAT-72: Scheme class is not case insensitive

Unit test of isArcRecord().

    • -0
    • +68
    ./TestArcReaderFactory_IsMagic.java
  1. … 1 more file in changeset.
JWAT-67: Validate support for sha256 Digest algorithm in WARC-header. Tested with sha1, sha-256, sha-512, tiger, RipeMD128, RipeMD160, RipeMD256, RipeMD320 and BouncyCastle.

Added reader and addHeader support for WARC-Refers-To-Target-URI and WARC-Refers-To-Date including minor tests.

    • -2
    • +2
    ./TestArcReader_NextAndIterRecord.java
  1. … 9 more files in changeset.
Fix for JWAT-65 and JWAT-66.

JWAT-65: HttpHeader digest not calculated when using getInputStream on payload.

JWAT-66: ArcRecords with invalid Urls do not get their HttpHeader parsed.

  1. … 6 more files in changeset.
Followup from reviews.

Saving of test data for use in JHOVE2.

'no-type' is ignored when looking for http headers.

Improved detection of possible arc record.

Minor tweaks.

  1. … 36 more files in changeset.
Zero-length ARC, WARC and GZip files are now reported as non-compliant.
  1. … 14 more files in changeset.
Generation of ARC/WARC test files based on unit tests; reorganization of test files. Minor tweaks.
    • -0
    • +176
    ./GenerateArcTestFiles.java
  1. … 63 more files in changeset.
Changed existing tabs to spaces even though personal preference is for tabs.
  1. … 4 more files in changeset.
Added an even more relaxed Uri profile for Heritrix written data.

Warc-Profile treated as a URI, oversight fixed (JWAT-61).

Minor review stuff.

Refactored Test classes file names.

    • -0
    • +42
    ./TestArc_UriProfile.java
  1. … 52 more files in changeset.
JWAT-59: Good progress on JWAT Uri implementation. Almost ready and tested.
  1. … 2 more files in changeset.
Followup to review CR-JWAS-25. Experimental Uri implementation.
  1. … 25 more files in changeset.
Partial conclusion of the following issues:

JWAT-46: ARC reader refactoring

JWAT-8: Unit tests and coverage of ARCRecordBase, ArcRecord and ArcVersionBlock

JWAT-45: ARC writer

Partial lenient Uri implementation.

    • -0
    • +510
    ./TestArcWriter_ArchiveLength.java
    • -0
    • +337
    ./TestArcWriter_States.java
  1. … 12 more files in changeset.
ArcWriter->ArcReader combo unit tested.
  1. … 8 more files in changeset.
Added the last validation checks in the Arc reader and added some unit tests for the new validation errors.
  1. … 4 more files in changeset.
startOffset tweaking and unit testing.

Unit testing of those hard to throw exceptions.

    • -67
    • +110
    ./TestArcReader_NextAndIterRecord.java
  1. … 19 more files in changeset.
Minor javadoc and complete unit testing of ArcRecord and ArcVersionBlock.
  1. … 6 more files in changeset.
A bit more unit testing.
    • -1
    • +70
    ./TestArcReaderFactory_IsMagic.java
    • -0
    • +285
    ./TestParams_Writer.java
  1. … 5 more files in changeset.
Followup on reviews. (CR-JWAS-19, CR-JWAS-20, CR-JWAS-21, CR-JWAS-22, CR-JWAS-23, CR-JWAS-24)
  1. … 44 more files in changeset.
JWAT-60: no-type was not correctly handled in ARCHeader rewrite.

JWAT-58: ARCReader now defaults to a less strict mode where LFs in an otherwise empty version block are allowed and trailing LFs between records are ignored if not =1.

JWAT-57: Should have been fixed in previous push.

Also added check for negative offset and archive-length.

Rewrote the record reader to not accept any line as a possible record header. Adds an error for data before a record if it occurs.

Added some unit testing here and there.

  1. … 7 more files in changeset.
Fixed a bug introduced with http request support in HttpHeader parser.

Added some unit tests for absolute resources in http request.

Added some more unit testing here and there.

Changed the ARC Writer slightly, still not 100% functional nor tested.

  1. … 13 more files in changeset.
Better handling of "-" in ARC header values.
    • -0
    • +93
    ./TestArcRecord.java
  1. … 4 more files in changeset.
ArchiveLengthStr was not exposed, but needed in JHove2 Module.
  1. … 1 more file in changeset.
Found a bug in the ARC reader, not all diagnoses were reported even though the isCompliant field was correctly updated.
  1. … 3 more files in changeset.
Improved handling of versionblock and metadata, added hasEmptyPayload if the payload has been completely processed by the reader.

Added some more errors/warnings that needed to be refactored.

Unit test of ArcVersionBlock.

    • -0
    • +637
    ./TestArcVersionBlock.java
  1. … 5 more files in changeset.
API changes for JHove2 modules.
  1. … 6 more files in changeset.
More unit testing and minor refactoring...
    • -655
    • +0
    ./TestArcReaderFactoryCompressed.java
    • -24
    • +719
    ./TestArcReaderCompressed.java
    • -647
    • +0
    ./TestArcReaderFactoryUncompressed.java
    • -22
    • +693
    ./TestArcReaderUncompressed.java
    • -2
    • +226
    ./TestArcReader_NextAndIterRecord.java
    • -0
    • +124
    ./TestArcRecordBase.java
  1. … 18 more files in changeset.
More unit testing, consumed now works for ARC/WARC sequential and random methods. Other minor refactoring.
    • -17
    • +262
    ./TestArcReaderFactoryCompressed.java
    • -22
    • +254
    ./TestArcReaderFactoryUncompressed.java
  1. … 20 more files in changeset.