[NAS-2518] Upgrade to latest H3 from Iceland Created: 28/Apr/16  Updated: 04/Oct/16  Resolved: 27/Sep/16

Status: Resolved
Project: NetarchiveSuite
Component/s: Heritrix 3
Affects Version/s: None
Fix Version/s: 5.2

Type: Bug Priority: Major
Reporter: Colin Rosenthal Assignee: Nicholas Clarke (Inactive)
Resolution: Fixed  
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Sprint: NAS 5.2

 Description   

Upgrade to Kristinn's 2015 release of H3. This task should include adding some more detailed harvesting checks to the integration test, such as harvesting netarkivet.dk and checking that a sensible number of bytes/documents are retrieved.
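
The kind of check described above could be sketched roughly as follows. This is only an illustration in Python, not NetarchiveSuite code: the function name check_harvest_totals and the threshold values are hypothetical, and sensible minimums for netarkivet.dk would have to be chosen from real harvests.

```python
def check_harvest_totals(total_bytes, total_documents, min_bytes, min_documents):
    """Return a list of problems found in the harvest totals (empty if none).

    All names and thresholds here are hypothetical; the totals would come
    from whatever the integration test reads back after the harvest.
    """
    problems = []
    if total_bytes < min_bytes:
        problems.append(
            f"only {total_bytes} bytes harvested, expected at least {min_bytes}"
        )
    if total_documents < min_documents:
        problems.append(
            f"only {total_documents} documents harvested, "
            f"expected at least {min_documents}"
        )
    return problems


if __name__ == "__main__":
    # Example values only; not measurements from a real harvest.
    for problem in check_harvest_totals(0, 0, min_bytes=1_000_000, min_documents=100):
        print(problem)
```

Returning a list of problems rather than failing on the first one lets the test report a byte shortfall and a document shortfall together.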



 Comments   
Comment by Nicholas Clarke (Inactive) [ 28/Sep/16 ]

To be honest, I think the version string is hardcoded,
even though the version is returned by the H3 REST API.
If that is the case, it should be fixed soon.
We thought of it before, but forgot it last time.
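
One way to avoid a hardcoded value is to pass the version in as a parameter when the metadata URI is built. The sketch below is Python for illustration only, not the NetarchiveSuite implementation: the function name metadata_uri and its parameters are hypothetical, and how the version is actually obtained from H3 is left out.

```python
from urllib.parse import urlencode


def metadata_uri(host, report_name, heritrix_version, harvest_id, job_id):
    """Build a metadata URI using a version supplied by the caller.

    Hypothetical helper: the URI layout follows the WARC-Target-URI
    examples in this ticket, nothing more.
    """
    query = urlencode(
        {
            "heritrixVersion": heritrix_version,
            "harvestid": harvest_id,
            "jobid": job_id,
        }
    )
    return f"metadata://{host}/crawl/reports/{report_name}?{query}"


if __name__ == "__main__":
    print(
        metadata_uri(
            "netarchivesuite.bnf.fr", "crawl-report.txt", "3.3.0-LBS-2016-02", 2, 47
        )
    )
```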

Comment by Sara Aubry [ 28/Sep/16 ]

While testing, we noticed a minor bug in the WARC-Target-URIs of the WARC metadata file.
The Heritrix version is correct, but the date is not: it should be Heritrix 3.3.0-LBS-2016-02 instead of 3.3.0-LBS-2014-03.

WARC-Target-URI: metadata://netarchivesuite.bnf.fr/crawl/reports/crawl-report.txt?heritrixVersion=3.3.0-LBS-2014-03&harvestid=2&jobid=47
WARC-Target-URI: metadata://netarchivesuite.bnf.fr/crawl/reports/processors-report.txt?heritrixVersion=3.3.0-LBS-2014-03&harvestid=2&jobid=47
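
A mismatch like this could be caught automatically by reading the heritrixVersion parameter back out of the WARC-Target-URI and comparing it with the expected version. The following is a minimal Python sketch; the function name heritrix_version_from_uri is hypothetical and not part of NetarchiveSuite.

```python
from urllib.parse import parse_qs, urlsplit


def heritrix_version_from_uri(target_uri):
    """Return the heritrixVersion query parameter, or None if it is absent."""
    query = parse_qs(urlsplit(target_uri).query)
    values = query.get("heritrixVersion", [])
    return values[0] if values else None


if __name__ == "__main__":
    uri = (
        "metadata://netarchivesuite.bnf.fr/crawl/reports/crawl-report.txt"
        "?heritrixVersion=3.3.0-LBS-2014-03&harvestid=2&jobid=47"
    )
    expected = "3.3.0-LBS-2016-02"
    found = heritrix_version_from_uri(uri)
    if found != expected:
        print(f"version mismatch: found {found}, expected {expected}")
```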

Comment by Colin Rosenthal [ 20/Sep/16 ]

Wasn't this done, tested, and put into production ages ago?

Comment by Nicholas Clarke (Inactive) [ 03/Jun/16 ]

Heritrix3->3.3.0-LBS-2016-02 uploaded to sb-nexus.
Version also changed in NAS master branch.

Comment by Colin Rosenthal [ 28/Apr/16 ]

This will now be the 2016 release - http://kris-sigur.blogspot.dk/2016/04/new-semi-stable-build-for-heritrix.html?spref=tw

We will need to start by making our own snapshot version of Kris' release available in a Maven repository (i.e. Nexus).

Generated at Fri Apr 26 15:54:30 CEST 2024 using Jira 9.4.15#940015-sha1:bdaa9cbecfb6791ea579749728cab771f0dfe90b.