JWAT-89: Removed encodedwords use in HeaderLineParser. Both need to be refactored and it is not really useful.
    • -2
    • +2
    ./org/jwat/warc/TestWarc_BadEncodingHeader.java
  1. … 1 more file in changeset.
CR-JWAS-33: Follow-up on review.
    • -3
    • +3
    ./org/jwat/warc/TestWarcFieldParsers.java
    • -2
    • +2
    ./org/jwat/warc/TestWarcFileWriter.java
    • -3
    • +3
    ./org/jwat/warc/TestWarcReaderCompressed.java
    • -1
    • +1
    ./org/jwat/warc/TestWarcReaderFactory.java
    • -3
    • +3
    ./org/jwat/warc/TestWarcReaderUncompressed.java
    • -1
    • +1
    ./org/jwat/warc/TestWarcRecordIerator.java
    • -2
    • +2
    ./org/jwat/warc/TestWarcWriterFactory.java
    • -1
    • +1
    ./org/jwat/warc/TestWarc_BadEncodingHeader.java
    • -1
    • +1
    ./org/jwat/warc/TestWarc_DigestFields.java
    • -1
    • +1
    ./org/jwat/warc/TestWarc_DuplicateFields.java
    • -1
    • +1
    ./org/jwat/warc/TestWarc_MissingHeadersAll.java
    • -1
    • +1
    ./org/jwat/warc/TestWarc_NonWarcHeaders.java
  1. … 59 more files in changeset.
JWAT-88: Unit test improved.
    • -15
    • +145
    ./org/jwat/warc/TestWarc_RecordTypeDigestCheck.java
JWAT-88: Change so payload digest is not checked for WARC revisit and continuation records.
    • -0
    • +279
    ./org/jwat/warc/TestWarc_RecordTypeDigestCheck.java
  1. … 1 more file in changeset.
JWAT-87: Improved detection of garbage at the end of (W)ARC files and added unit tests for this. Also added unit tests for empty (W)ARC files.
    • -17
    • +87
    ./org/jwat/warc/TestWarcReader_Diagnosis.java
  1. … 5 more files in changeset.
Made some changes so ArcHeader, WarcHeader and HttpHeader can be re-parsed without using the complete Arc/Warc Readers.
    • -1
    • +0
    ./org/jwat/warc/TestWarcHeaderHelper.java
  1. … 4 more files in changeset.
ArcReader and WarcReader now implement the Iterable<..> interface.
    • -1
    • +2
    ./org/jwat/warc/TestWarcReaders_Params.java
  1. … 3 more files in changeset.
Use default charset in case of bad charset and handle bad encoding in WARC-Target-URI header (add a simple test case)
    • -0
    • +111
    ./org/jwat/warc/TestWarc_BadEncodingHeader.java
  1. … 4 more files in changeset.
JWAT-77: Unit tests and bug fixes for newly implemented ArcFileWriter/WarcFileWriter and related classes.

JWAT-76: Fix for archiveLengthStr/contentLengthStr set and archiveLength/contentLength null when using payload length validation.

Removed a lot of tabs and replaced with spaces. (Company policy)

Minor code cleanup.

    • -2
    • +0
    ./org/jwat/warc/SaveWarcTestFiles.java
    • -0
    • +127
    ./org/jwat/warc/TestHelpers.java
    • -3
    • +3
    ./org/jwat/warc/TestWarcFieldParsers.java
    • -0
    • +71
    ./org/jwat/warc/TestWarcFileNamingDefault.java
    • -0
    • +55
    ./org/jwat/warc/TestWarcFileNamingSingleFile.java
    • -0
    • +505
    ./org/jwat/warc/TestWarcFileWriter.java
    • -0
    • +64
    ./org/jwat/warc/TestWarcFileWriterConfig.java
    • -2
    • +1
    ./org/jwat/warc/TestWarcHeaderAddDatatyped.java
    • -2
    • +1
    ./org/jwat/warc/TestWarcHeaderFieldPolicy.java
    • -2
    • +1
    ./org/jwat/warc/TestWarcHeaderHelper.java
    • -2
    • +1
    ./org/jwat/warc/TestWarcHeaderVersion.java
    • -45
    • +26
    ./org/jwat/warc/TestWarcReaderCompressed.java
    • -2
    • +2
    ./org/jwat/warc/TestWarcReaderFactory.java
  1. … 82 more files in changeset.
Fixed some texts. Added some spaces.
    • -3
    • +3
    ./org/jwat/warc/TestWarcFieldParsers.java
    • -3
    • +3
    ./org/jwat/warc/TestWarcReaderCompressed.java
    • -1
    • +1
    ./org/jwat/warc/TestWarcReaderFactory.java
    • -3
    • +3
    ./org/jwat/warc/TestWarcReaderUncompressed.java
    • -1
    • +1
    ./org/jwat/warc/TestWarcRecordIerator.java
    • -2
    • +2
    ./org/jwat/warc/TestWarcWriterFactory.java
    • -1
    • +1
    ./org/jwat/warc/TestWarc_DigestFields.java
    • -1
    • +1
    ./org/jwat/warc/TestWarc_DuplicateFields.java
    • -1
    • +1
    ./org/jwat/warc/TestWarc_MissingHeadersAll.java
    • -1
    • +1
    ./org/jwat/warc/TestWarc_NonWarcHeaders.java
    • -1
    • +1
    ./org/jwat/warc/TestWarc_SegmentNumber.java
    • -1
    • +1
    ./org/jwat/warc/TestWarc_UpperLowerCase.java
  1. … 27 more files in changeset.
JWAT-69: Unit tested WARC-Refers-To-Target-URI and WARC-Refers-To-Date in writer.

Fixed some copy/paste errors in the two new headers.

    • -0
    • +16
    ./org/jwat/warc/TestWarcWriter_Headers.java
  1. … 1 more file in changeset.
JWAT-69: Unit tested WARC-Refers-To-Target-URI and WARC-Refers-To-Date in reader.

Fixed some small bugs and omissions with the reading of those new headers.

Removed some tabs.

    • -0
    • +42
    ./org/jwat/warc/TestWarcConstants.java
    • -10
    • +52
    ./org/jwat/warc/TestWarcHeader.java
    • -1
    • +11
    ./org/jwat/warc/TestWarcHeaderAddDatatyped.java
    • -0
    • +6
    ./org/jwat/warc/TestWarcHeaderFieldPolicy.java
    • -2
    • +2
    ./org/jwat/warc/TestWarcHeaderHelper.java
  1. … 7 more files in changeset.
JWAT-70: Unit test DataFormatException in Gzip reader/writer.

JWAT-71: Found IndexOutOfBoundsException while unit testing DataFormatException in GzipWriter.

Removed some tabs.

    • -22
    • +22
    ./org/jwat/warc/TestWarcRecordDigests.java
  1. … 7 more files in changeset.
Improved Gzip reader getConsumed() and getOffset() unit testing.
    • -1
    • +0
    ./org/jwat/warc/TestWarcRecordDigests.java
  1. … 2 more files in changeset.
JWAT-67: Validate support for sha256 Digest algorithm in WARC-header. Tested with sha1, sha-256, sha-512, tiger, RipeMD128, RipeMD160, RipeMD256, RipeMD320 and BouncyCastle.

Added reader and addHeader support for WARC-Refers-To-Target-URI and WARC-Refers-To-Date including minor tests.

    • -1
    • +14
    ./org/jwat/warc/TestWarcHeader.java
    • -0
    • +97
    ./org/jwat/warc/TestWarcRecordDigests.java
  1. … 9 more files in changeset.
Follow-up from reviews.

Saving of test data for use in JHOVE2.

'no-type' is ignored when looking for http headers.

Improved detection of possible arc record.

Minor tweaks.

    • -91
    • +0
    ./org/jwat/warc/GenerateWarcTestFiles.java
    • -1
    • +99
    ./org/jwat/warc/SaveWarcTestFiles.java
    • -1
    • +0
    ./org/jwat/warc/TestWarcFieldParsers.java
    • -1
    • +1
    ./org/jwat/warc/TestWarcHeaderFieldPolicy.java
    • -6
    • +6
    ./org/jwat/warc/TestWarcHeaderVersion.java
    • -1
    • +1
    ./org/jwat/warc/TestWarcReader_Diagnosis.java
    • -5
    • +5
    ./org/jwat/warc/TestWarcRecordDigests.java
    • -4
    • +4
    ./org/jwat/warc/TestWarc_UriProfile.java
  1. … 33 more files in changeset.
Zero length ARC, WARC and GZip files are now reported as non compliant.
    • -3
    • +3
    ./org/jwat/warc/GenerateWarcTestFiles.java
    • -1
    • +1
    ./org/jwat/warc/TestWarcFieldParsers.java
    • -0
    • +26
    ./org/jwat/warc/TestWarcReader.java
    • -1
    • +6
    ./org/jwat/warc/TestWarcReader_Diagnosis.java
  1. … 12 more files in changeset.
Generation of ARC/WARC test files based on unit tests, reorganization of test files. Minor tweaks.
    • -0
    • +91
    ./org/jwat/warc/GenerateWarcTestFiles.java
    • -2
    • +54
    ./org/jwat/warc/TestWarcFieldParsers.java
    • -17
    • +38
    ./org/jwat/warc/TestWarcHeaderFieldPolicy.java
    • -0
    • +9
    ./org/jwat/warc/TestWarcHeaderVersion.java
    • -0
    • +3
    ./org/jwat/warc/TestWarcReader_Diagnosis.java
    • -0
    • +16
    ./org/jwat/warc/TestWarcRecordDigests.java
    • -1
    • +1
    ./org/jwat/warc/TestWarc_DigestFields.java
    • -2
    • +2
    ./org/jwat/warc/TestWarc_DuplicateFields.java
    • -5
    • +5
    ./org/jwat/warc/TestWarc_MissingHeadersAll.java
    • -1
    • +1
    ./org/jwat/warc/TestWarc_NonWarcHeaders.java
    • -2
    • +2
    ./org/jwat/warc/TestWarc_SegmentNumber.java
    • -1
    • +1
    ./org/jwat/warc/TestWarc_UpperLowerCase.java
  1. … 54 more files in changeset.
Raw ARC record line now stored.

Removed some tabs.

    • -12
    • +12
    ./org/jwat/warc/TestWarcFieldParsers.java
  1. … 5 more files in changeset.
Strict validation of <> encapsulating some URIs.

Tying up loose ends.

    • -30
    • +154
    ./org/jwat/warc/TestWarcFieldParsers.java
    • -8
    • +18
    ./org/jwat/warc/TestWarcHeaderAddDatatyped.java
  1. … 12 more files in changeset.
Minor refactoring of API, unit tests, etc.
  1. … 9 more files in changeset.
Changed existing spaces to tabs even though personal preference is for tabs.
    • -22
    • +22
    ./org/jwat/warc/TestWarcHeader.java
    • -5
    • +5
    ./org/jwat/warc/TestWarcHeaderFieldPolicy.java
    • -46
    • +46
    ./org/jwat/warc/TestWarc_UriProfile.java
  1. … 2 more files in changeset.
Added an even more relaxed Uri profile for Heritrix written data.

Warc-Profile treated as a URI, oversight fixed (JWAT-61).

Minor review stuff.

Refactored Test classes file names.

    • -181
    • +0
    ./org/jwat/warc/TestFieldParsers.java
    • -8
    • +195
    ./org/jwat/warc/TestWarcFieldParsers.java
    • -71
    • +122
    ./org/jwat/warc/TestWarcHeader.java
    • -1
    • +6
    ./org/jwat/warc/TestWarcHeaderFieldPolicy.java
    • -0
    • +3
    ./org/jwat/warc/TestWarcHeaderHelper.java
    • -74
    • +0
    ./org/jwat/warc/TestWarcReaderDiagnosis.java
    • -1
    • +75
    ./org/jwat/warc/TestWarcReader_Diagnosis.java
    • -387
    • +0
    ./org/jwat/warc/TestParams_Readers.java
    • -1
    • +388
    ./org/jwat/warc/TestWarcReaders_Params.java
    • -376
    • +0
    ./org/jwat/warc/TestWarcWriterHeaders.java
    • -1
    • +377
    ./org/jwat/warc/TestWarcWriter_Headers.java
    • -252
    • +0
    ./org/jwat/warc/TestWarcWriterPayload.java
    • -2
    • +254
    ./org/jwat/warc/TestWarcWriter_Payload.java
    • -303
    • +0
    ./org/jwat/warc/TestParams_Writers.java
    • -1
    • +304
    ./org/jwat/warc/TestWarcWriters_Params.java
  1. … 44 more files in changeset.
Follow-up to review CR-JWAS-25. Experimental Uri implementation.
    • -13
    • +12
    ./org/jwat/warc/TestWarcConcurrentTo.java
    • -12
    • +12
    ./org/jwat/warc/TestWarcHeaderAddDatatyped.java
    • -2
    • +2
    ./org/jwat/warc/TestWarcReaderCompressed.java
    • -2
    • +2
    ./org/jwat/warc/TestWarcReaderUncompressed.java
    • -2
    • +0
    ./org/jwat/warc/TestWarcWriterFactory.java
    • -12
    • +12
    ./org/jwat/warc/TestWarcWriterHeaders.java
  1. … 25 more files in changeset.
Partial conclusion of the following issues:

JWAT-46: ARC reader refactoring

JWAT-8: Unit tests and coverage of ARCRecordBase, ArcRecord and ArcVersionBlock

JWAT-45: ARC writer

Partial lenient Uri implementation.

    • -8
    • +27
    ./org/jwat/warc/TestParams_Writers.java
    • -4
    • +1
    ./org/jwat/warc/TestWarcWriterHeaders.java
    • -14
    • +14
    ./org/jwat/warc/TestWarcWriter_States.java
  1. … 12 more files in changeset.
ArcWriter->ArcReader combo unit tested.
    • -0
    • +26
    ./org/jwat/warc/TestWarcReader.java
    • -336
    • +0
    ./org/jwat/warc/TestWarcWriter.java
    • -1
    • +337
    ./org/jwat/warc/TestWarcWriter_States.java
  1. … 10 more files in changeset.
startOffset tweaking and unit testing.

Unit testing of those hard-to-throw exceptions.

    • -0
    • +86
    ./org/jwat/warc/TestWarcReader.java
    • -0
    • +60
    ./org/jwat/warc/TestWarcReaderCompressed.java
    • -0
    • +52
    ./org/jwat/warc/TestWarcReaderUncompressed.java
    • -0
    • +53
    ./org/jwat/warc/TestWarcRecordIerator.java
  1. … 19 more files in changeset.
A bit more unit testing.
    • -0
    • +387
    ./org/jwat/warc/TestParams_Readers.java
    • -0
    • +284
    ./org/jwat/warc/TestParams_Writers.java
  1. … 8 more files in changeset.
Follow-up on reviews. (CR-JWAS-19, CR-JWAS-20, CR-JWAS-21, CR-JWAS-22, CR-JWAS-23, CR-JWAS-24)
    • -1
    • +1
    ./org/jwat/warc/TestWarcWriterFactory.java
  1. … 42 more files in changeset.
Fixed a bug introduced with http request support in HttpHeader parser.

Added some unit tests for absolute resources in http request.

Added some more unit testing here and there.

Changed the ARC Writer slightly, still not 100% functional nor tested.

    • -0
    • +3
    ./org/jwat/warc/TestWarcConcurrentTo.java
  1. … 14 more files in changeset.