For the failed job, check that the overrides are visible. This can be done by comparing the original setup/order files with the modified files. The setup/order files are stored in the metadata WARC files generated as part of the harvest.
The easiest way to do this is from test@kb-prod-udv-001, replacing <jobno> with the number of the original job (here 7):
[test@kb-prod-udv-001 ~]$ ssh email@example.com grep max-hops /netarkiv/0001/TEST2/filedir/<jobno>-metadata-1.warc
[test@kb-prod-udv-001 ~]$ ssh firstname.lastname@example.org grep delay-factor /netarkiv/0001/TEST2/filedir/<jobno>-metadata-1.warc
Note that there should be two setup/order reports. The one containing a timestamp in its name is the original order.xml; the one called simply metadata://netarkivet.dk/crawl/setup/order.xml is the final modified version.
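If a copy of the metadata WARC file is available locally, the two grep checks can be combined into one small shell function. This is only a sketch: the function name show_overrides and the idea of working on a local copy are assumptions for illustration, not part of NetarchiveSuite.

```shell
# Hypothetical helper (not part of NetarchiveSuite): print every line that
# mentions one of the two overridden settings in a metadata WARC file.
show_overrides() {
    warc_file="$1"
    for setting in max-hops delay-factor; do
        echo "== $setting =="
        # -a makes grep treat the WARC file as text even if it holds binary records
        grep -a -- "$setting" "$warc_file" || echo "(no match)"
    done
}
```

Called as show_overrides <jobno>-metadata-1.warc, it lists the matches for both settings. Since the file holds both the original and the modified order.xml, each setting would be expected to appear in both, with different values if the override took effect.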
Check that Alias Domains are not Harvested
Check that there was no Deduplication
CLARIFY: Using a browser set up for ViewerProxy access, check the processors-report for one of the snapshot-harvest jobs:
- Go to the Job details page for the newly finished job by clicking the link in the JobID column.
- Click the "Browse reports for jobs" link.
Confirm that there was no DeDuplicator report with a "Duplicate found" line.
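If the processors-report has been saved from the browser to a local text file, the same confirmation can be scripted. The sketch below assumes such a saved copy exists; the function name check_no_dedup and the file name are hypothetical.

```shell
# Hypothetical helper: succeed only if the saved report contains
# no "Duplicate found" line.
check_no_dedup() {
    report="$1"
    if grep -q "Duplicate found" "$report"; then
        echo "FAIL: report contains a 'Duplicate found' line"
        return 1
    fi
    echo "OK: no 'Duplicate found' line"
}
```

For example, check_no_dedup processors-report.txt prints OK and returns 0 when no deduplication was reported.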
Stop the Test and Clean-Up