Nicholas Clarke

Fixed bug in crawllog caching. Worked on optimizing the frontier queue viewer. Removed useless auto refresh from most pages.

Merge branch 'master' into NAS-2638

NAS-2638: Handle script errors in Scripting/FrontierQueue by showing stacktrace on failure. SLF4J/Logback currently not included in the assembled webapp.

NAS-2638: Merge and fix of NAS-2642. Edited some buttons and removed tomcat artifacts from WEB-INF/lib directory.

NAS-2638: Moved menu generation to template builder wrapper. Improved config by catching exception for invalid regexes.

WEBDAN-269: Tweaked the menu.

NAS-2638: Tweaked zip artifact directory structure.

NAS-2638: Had to change the dependencies to include more since everything is mixed together.

NAS-2638: Also builds an H3 monitor artifact that uses tomcat embedded. H3 monitor resources split into separate class files. HTMLUtils modified to not be dependent on JSPOutputStream.

Builds an artifact that uses tomcat embedded. Resources split into separate class files. HTMLUtils modified to not be dependent on JSPOutputStream.

Added and fixed drop in hbase-phoenix ddl.

NAS-2638: Moved H3 remote monitor to separate module.

NAS-2638: Moved H3 monitor classes to separate module.

NAS-2641: Pagination now supports additional parameters in the page links.

Merge branch 'master' into staging

H3 monitor review follow-up and cleanup.

Improve support for empty files and errors at the end of (W)ARC files in the ArchiveParser/ArchiveParserCallback.
JWAT-89: Removed encodedwords use in HeaderLineParser. Both need to be refactored and it is not really useful.
CR-JWAS-33: Follow-up on review.
JWAT-88: Unit test improved.
JWAT-88: Change so payload digest is not checked for WARC revisit and continuation records.
JWAT-87: Improved detection of garbage at the end of (W)ARC files and unit tests of this. Also added unit tests testing empty (W)ARC files.
    /jwat-arc/src/test/resources/invalid-arcfile-record-then-garbage.arc (+60, -0)
    /jwat-warc/src/test/resources/invalid-warcfile-record-then-garbage.warc (+43, -0)
NAS-2610: Review follow-up.

Forgot to remove some dependencies on ArcHeader in the parser.
NAS-2610: Forgot to add a null check.

Merge branch 'master' into NAS-2610

Made some changes so ArcHeader, WarcHeader and HttpHeader can be re-parsed without using the complete Arc/Warc Readers.
Made some methods public instead of protected. Various cleanup.
Simple progress output to loadseeds.