...

For the failed job, check that the overrides are visible by comparing the original setup/order files with the modified files. The setup/order files are part of the metadata WARC files generated as part of the harvest.

The easiest way to do this is from test@kb-prod-udv-001 (replace <jobno> with the number of the original job, here 7):

Code Block
[test@kb-prod-udv-001 ~]$ ssh netarkiv@sb-test-bar-001.statsbiblioteket.dk grep max-hops /netarkiv/0001/TEST2/filedir/<jobno>-metadata-1.warc
[test@kb-prod-udv-001 ~]$ ssh netarkiv@sb-test-bar-001.statsbiblioteket.dk grep delay-factor /netarkiv/0001/TEST2/filedir/<jobno>-metadata-1.warc
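
The two greps above can also be combined into one pass. The following is only a minimal sketch: `show_overrides` is a hypothetical helper name (not part of NetarchiveSuite), and it assumes GNU grep and that the metadata file is readable from where the function runs.

```shell
# Hypothetical helper: print every line of a metadata WARC file that
# mentions one of the overridden settings.
#   -a  treat the file as text even if it contains binary data
#   -n  prefix each match with its line number, which helps to tell the
#       original order.xml apart from the modified one
#   -E  allow the max-hops|delay-factor alternation
show_overrides() {
    grep -a -n -E 'max-hops|delay-factor' "$1"
}
```

On the archive host this could be called as `show_overrides /netarkiv/0001/TEST2/filedir/7-metadata-1.warc` for job 7. Each setting should then appear more than once, with differing values where an override was applied.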

...

Note that there should be two setup/order reports: the one containing a timestamp in its name is the original order.xml, and the one called simply metadata://netarkivet.dk/crawl/setup/order.xml is the final modified version.
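
To see which setup/order reports a metadata file actually contains, the record names can be listed. This is a rough sketch under assumptions: `list_order_records` is a hypothetical helper name, and it assumes that both report names appear as plain text in the WARC record headers and contain the string `crawl/setup/order`.

```shell
# Hypothetical helper: list the lines of a metadata WARC file that name a
# setup/order report, with line numbers (-n). -a makes grep treat the
# file as text even if it contains binary data.
list_order_records() {
    grep -a -n 'crawl/setup/order' "$1"
}
```

If the assumptions hold, the output should name two reports: one with a timestamp in its name and one called simply order.xml.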

Check that Alias Domains are not Harvested

...

Check that there was no Deduplication

CLARIFY: Using a browser set up for ViewerProxy access, check the processors-report for one of the snapshot-harvest jobs:

  1. Go to the Job details page for the newly finished job by clicking the link in the JobID column.
  2. Click "Browse reports for jobs".

Confirm that there was no DeDuplicator report with a "Duplicate found" line.

Stop the Test and Clean-Up

...