jwat-warc

Added callback support to read and modify the ARC/WARC headers before processing the payload.
    • -0
    • +11
    ./src/main/java/org/jwat/warc/WarcReader.java
    • -7
    • +10
    ./src/main/java/org/jwat/warc/WarcRecord.java
    • -0
    • +24
    ./src/main/java/org/jwat/warc/WarcRecordParserCallback.java
  1. … 5 more files in changeset.
[maven-release-plugin] prepare for next development iteration
  1. … 7 more files in changeset.
[maven-release-plugin] prepare release jwat-1.1.0
  1. … 7 more files in changeset.
JWAT-89: Removed encodedwords use in HeaderLineParser. Both need to be refactored and it is not really useful.
  1. … 1 more file in changeset.
CR-JWAS-33: Follow-up on review.
    • -1
    • +1
    ./src/main/java/org/jwat/warc/WarcHeader.java
    • -3
    • +3
    ./src/main/java/org/jwat/warc/WarcReader.java
    • -4
    • +4
    ./src/main/java/org/jwat/warc/WarcRecord.java
    • -9
    • +9
    ./src/main/java/org/jwat/warc/WarcWriter.java
  1. … 59 more files in changeset.
[maven-release-plugin] prepare for next development iteration
  1. … 7 more files in changeset.
[maven-release-plugin] prepare release jwat-1.0.6
  1. … 7 more files in changeset.
JWAT-88: Unit test improved.
JWAT-88: Change so payload digest is not checked for WARC revisit and continuation records.
    • -1
    • +2
    ./src/main/java/org/jwat/warc/WarcRecord.java
    • -0
    • +279
    ./src/test/java/org/jwat/warc/TestWarc_RecordTypeDigestCheck.java
JWAT-87: Improved detection of garbage at the end of (W)ARC files and added unit tests for this. Also added unit tests for empty (W)ARC files.
    • -2
    • +6
    ./src/main/java/org/jwat/warc/WarcRecord.java
    • -0
    • +43
    ./src/test/resources/invalid-warcfile-record-then-garbage.warc
  1. … 3 more files in changeset.
Made some changes so ArcHeader, WarcHeader and HttpHeader can be re-parsed without using the complete Arc/Warc Readers.
    • -33
    • +35
    ./src/main/java/org/jwat/warc/WarcHeader.java
    • -12
    • +7
    ./src/main/java/org/jwat/warc/WarcWriter.java
  1. … 2 more files in changeset.
Made some methods public instead of protected. Various cleanup.
    • -11
    • +11
    ./src/main/java/org/jwat/warc/WarcFieldParsers.java
    • -43
    • +44
    ./src/main/java/org/jwat/warc/WarcHeader.java
  1. … 1 more file in changeset.
[maven-release-plugin] prepare for next development iteration
  1. … 7 more files in changeset.
[maven-release-plugin] prepare release jwat-1.0.5
  1. … 7 more files in changeset.
ArcReader and WarcReader now implement the Iterable<..> interface.
    • -6
    • +6
    ./src/main/java/org/jwat/warc/WarcReader.java
  1. … 2 more files in changeset.
Use default charset in case of bad charset and handle bad encoding in WARC-Target-URI header (add a simple test case)
    • -0
    • +6
    ./src/main/java/org/jwat/warc/WarcHeader.java
    • -0
    • +111
    ./src/test/java/org/jwat/warc/TestWarc_BadEncodingHeader.java
    • binary
    ./src/test/resources/invalid-warcfile-encoding-headers.warc.gz
  1. … 2 more files in changeset.
[maven-release-plugin] prepare for next development iteration
  1. … 7 more files in changeset.
[maven-release-plugin] prepare release jwat-1.0.4
  1. … 7 more files in changeset.
[maven-release-plugin] prepare for next development iteration
  1. … 7 more files in changeset.
[maven-release-plugin] prepare release jwat-1.0.3
  1. … 7 more files in changeset.
JWAT-77: Unit tests and bug fixes for newly implemented ArcFileWriter/WarcFileWriter and related classes.

JWAT-76: Fix for archiveLengthStr/contentLengthStr set and archiveLength/contentLength null when using payload length validation.

Removed a lot of tabs and replaced them with spaces. (Company policy)

Minor code cleanup.

    • -62
    • +108
    ./src/main/java/org/jwat/warc/WarcFileWriter.java
    • -0
    • +7
    ./src/main/java/org/jwat/warc/WarcWriter.java
    • -0
    • +127
    ./src/test/java/org/jwat/warc/TestHelpers.java
    • -0
    • +71
    ./src/test/java/org/jwat/warc/TestWarcFileNamingDefault.java
    • -0
    • +505
    ./src/test/java/org/jwat/warc/TestWarcFileWriter.java
    • -0
    • +64
    ./src/test/java/org/jwat/warc/TestWarcFileWriterConfig.java
  1. … 82 more files in changeset.
ANVLRecord adds space after ":" to make output pretty.

Made constant in WarcFileWriter public.

  1. … 2 more files in changeset.
JWAT-77: Add (W)ArcFileWriter helper classes.
    • -0
    • +26
    ./src/main/java/org/jwat/warc/WarcFileNaming.java
    • -0
    • +82
    ./src/main/java/org/jwat/warc/WarcFileNamingDefault.java
    • -0
    • +44
    ./src/main/java/org/jwat/warc/WarcFileNamingSingleFile.java
    • -0
    • +177
    ./src/main/java/org/jwat/warc/WarcFileWriter.java
    • -0
    • +52
    ./src/main/java/org/jwat/warc/WarcFileWriterConfig.java
  1. … 5 more files in changeset.
[maven-release-plugin] prepare for next development iteration
  1. … 6 more files in changeset.
[maven-release-plugin] prepare release jwat-1.0.2
  1. … 6 more files in changeset.
Fixed some texts. Added some spaces.
    • -4
    • +4
    ./src/main/java/org/jwat/warc/WarcReader.java
    • -2
    • +2
    ./src/main/java/org/jwat/warc/WarcRecord.java
    • -1
    • +1
    ./src/main/java/org/jwat/warc/WarcWriter.java
  1. … 27 more files in changeset.
Classes with a close() method now implement Closeable.
    • -1
    • +2
    ./src/main/java/org/jwat/warc/WarcReader.java
    • -7
    • +8
    ./src/main/java/org/jwat/warc/WarcRecord.java
    • -6
    • +7
    ./src/main/java/org/jwat/warc/WarcWriter.java
  1. … 8 more files in changeset.
Fixed the javadoc so that the command 'mvn -Psonatype-oss-release clean install -Dgpg.skip=true' works
    • -0
    • +1
    ./src/main/java/org/jwat/warc/WarcDigest.java
    • -21
    • +19
    ./src/main/java/org/jwat/warc/WarcHeader.java
    • -6
    • +5
    ./src/main/java/org/jwat/warc/WarcReader.java
  1. … 6 more files in changeset.
[maven-release-plugin] prepare for next development iteration
  1. … 6 more files in changeset.
[maven-release-plugin] prepare release jwat-1.0.1
  1. … 6 more files in changeset.