
As part of the development process we've created a new test case that lets us run all access and processing against either one replica or the other. The two cases are specified by the deployment configurations default_deploy_config_bitmagaar.xml and default_deploy_config_bitmagkbh.xml, which can be found in the Danish netarkivet stash repository. These can be installed on the test system using the standard procedure:

export TESTX=Test<x> (one of our pre-assigned CollectionIDs in bitmag)
export PORT=807?
export CONF=default_deploy_config_bitmag<xxx>.xml
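
As a small illustration of how the CONF value relates to the replica under test, here is a minimal Python sketch. The helper name `deploy_config_name` and the idea of keying on the suffixes `aar` and `kbh` are assumptions made for this example; only the two file names come from the text above.

```python
# Replica suffixes as they appear in the two configuration file names.
REPLICAS = ("aar", "kbh")


def deploy_config_name(replica: str) -> str:
    """Return the deployment configuration file name for a replica suffix.

    Hypothetical helper: it only builds the file name, it does not
    check that the file exists in the repository.
    """
    if replica not in REPLICAS:
        raise ValueError(f"unknown replica {replica!r}, expected one of {REPLICAS}")
    return f"default_deploy_config_bitmag{replica}.xml"
```

The result of `deploy_config_name("kbh")` is the value you would export as CONF when testing that replica.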

The configurations are such that there is only one BitarchiveMonitorApplication in each of the two configurations, so there is no possibility of running batch jobs by accident on the wrong replica. A good suite of tests for each configuration is:

  1. Run a small (10MB) selective harvest
  2. Check that the files are uploaded to the correct collection in bitmag
  3. Check that the "Browse harvest files of job" link for the harvest shows exactly two files - one metadata, one data
  4. Check that "Browse reports for jobs" shows links to all the metadata records
  5. With the standard ViewerProxy setup, check that the metadata record links work
  6. Click on "Select this job for QA with viewerproxy" and check that you can actually browse in the harvested data
  7. Look at all the BitarchiveApplication instances in the Systemstate overview. The "opposite" replica, the one that you are not testing, should only show messages about sending heartbeats or sleeping until trying again. [There is a slight, but non-blocking, issue here: the heartbeats are sent to a queue with no listener, which eventually fills up and stops accepting messages.]
  8. Rerun the same harvest
  9. Check the processors-report.txt metadata record. The DeDuplicator report should show a non-zero "Duplicates found" count
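
The check in step 9 can also be done on the text of the processors-report.txt record once you have copied it out of the browser. The following is a minimal Python sketch. It assumes the report contains a line with the label "Duplicates found" followed by an integer count; the function name `duplicates_found` is made up for this example, and the exact report layout may differ in your installation.

```python
import re

# Assumed layout: the label "Duplicates found", an optional colon,
# whitespace, then the count as an integer.
DUPLICATES_RE = re.compile(r"Duplicates found:?\s*(\d+)")


def duplicates_found(report_text: str) -> int:
    """Return the "Duplicates found" count from a processors report."""
    match = DUPLICATES_RE.search(report_text)
    if match is None:
        raise ValueError("no 'Duplicates found' line in report")
    return int(match.group(1))
```

After the rerun in step 8, `duplicates_found(text) > 0` is the condition you are looking for.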

With this done, you have checked:

  • File-listing via bitarkiv
  • Browsing-access via bitarkiv
  • Batch-indexing for Viewerproxy
  • Batch-indexing for deduplication

for the replica under test.
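
The file-listing check from step 3 (exactly two files, one metadata and one data) can be expressed as a short predicate. This is only a sketch: it assumes that the metadata file can be recognised by the substring "metadata" in its name, and the file names in the usage note are invented for illustration.

```python
from typing import Sequence


def is_expected_file_pair(filenames: Sequence[str]) -> bool:
    """True if the listing holds exactly one metadata and one data file.

    Assumption: a file is a metadata file if and only if its name
    contains the substring "metadata".
    """
    metadata = [name for name in filenames if "metadata" in name]
    data = [name for name in filenames if "metadata" not in name]
    return len(filenames) == 2 and len(metadata) == 1 and len(data) == 1
```

For example, a listing such as `["1-metadata-1.warc", "1-1-example.warc"]` (hypothetical names) passes, while a listing with a second data file does not.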

Adding wayback to the mix is, as usual, fiddly because you also have to fix the replica being used in the NAS settings file that gets passed to the wayback webapp under tomcat. However, you can easily check wayback indexing by browsing in the indexDir directory in the test installation directory on (It can be up to 20 minutes behind the harvesting.) You should easily be able to find cdx entries corresponding to the domains you harvested.
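
If you prefer not to eyeball the index files, the search for cdx entries can be sketched in Python as below. The sketch assumes that each cdx line starts with a URL key in plain host/path form (not SURT form) and that fields are separated by whitespace; the function name `cdx_lines_for_domain` is made up for this example.

```python
from typing import Iterable, List


def cdx_lines_for_domain(lines: Iterable[str], domain: str) -> List[str]:
    """Return the cdx lines whose first field mentions the given domain."""
    hits = []
    for line in lines:
        # Skip the optional format header line, which starts with " CDX".
        if line.startswith(" CDX"):
            continue
        fields = line.split()
        # Assumption: the first field is the URL key and contains the
        # domain as plain text.
        if fields and domain in fields[0]:
            hits.append(line.rstrip("\n"))
    return hits
```

Feed it the lines of whichever files you find under indexDir; an empty result after about 20 minutes suggests that indexing for the replica under test is not working.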
