Check that the modularisation of the software allows for separate installation of the components.

Goals

Prerequisites

This test needs the following deploy configurations: deploy_config_arcrepository.xml, deploy_config_harvester.xml, deploy_config_viewerproxy.xml

Procedure

Start a Pure ArcRepository (default config with the harvester-apps deleted)

On devel@kb-prod-udv-001.kb.dk:

export TESTX=TEST9
export PORT=807?
export MAILRECEIVERS=foo@bar.dk
prepare_test.sh deploy_config_arcrepository.xml
install_test.sh
start_test.sh 
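
The exports above must be filled in before the scripts run; in particular the PORT value shown still contains a placeholder character. A minimal pre-flight sketch, assuming a POSIX shell, could catch this. The function name check_test_env is hypothetical and not part of the test scripts:

```shell
# Hypothetical pre-flight check; not part of the NetarchiveSuite test scripts.
check_test_env() {
  # TESTX, PORT and MAILRECEIVERS must all be set and non-empty.
  for name in TESTX PORT MAILRECEIVERS; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      return 1
    fi
  done
  # PORT must consist of digits only, so an unreplaced placeholder is rejected.
  case "$PORT" in
    *[!0-9]*) echo "PORT is not numeric: $PORT" >&2; return 1 ;;
  esac
  return 0
}
```

Running check_test_env before prepare_test.sh stops the procedure early if a variable is missing or PORT has not been replaced with a real port number.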

Check that the ArcRepository is Running

Open the GUI at http://kb-test-adm-001.kb.dk:$PORT/BitPreservation . The leftmost menu should show only two site sections.

Click on "Bitpreservation" and update all filelists and checksums. There should be no errors.

Click "System State" and check that there are no warnings or errors.

Upload a File to the ArcRepository

On devel@kb-test-adm-001:

export TESTX=TEST9
scp devel@kb-prod-udv-001.kb.dk:bitarchive_testdata/arcfiles/1-1-20130110140720-00000-kb-test-har-001.kb.dk.arc /tmp
cd $HOME/$TESTX
export CLASSPATH=$CLASSPATH:$HOME/$TESTX/lib/netarchivesuite-archive-core.jar
export CLASSPATH=$CLASSPATH:$HOME/$TESTX/lib/netarchivesuite-monitor-core.jar
java -Ddk.netarkivet.settings.file=$HOME/$TESTX/conf/settings_ArcRepositoryApplication.xml -Dsettings.common.applicationInstanceId=upload dk.netarkivet.archive.tools.Upload /tmp/1-1-20130110140720-00000-kb-test-har-001.kb.dk.arc 

Run a Batch Job

[This step is deprecated - csr] Download a Record from the ArcRepository

On kb-test-adm-001:

export TESTX=TEST9
mkdir /tmp/$TESTX 
cd /tmp/$TESTX
scp -r test@kb-prod-udv-001.kb.dk:test-data/1-cache.tar.bz2 .
mkdir test-index
cd test-index
tar -xjf ../1-cache.tar.bz2
cd $HOME/$TESTX
java -Ddk.netarkivet.settings.file=$HOME/$TESTX/conf/settings_ArcRepositoryApplication.xml -Dsettings.common.applicationInstanceId=record dk.netarkivet.archive.tools.GetRecord /tmp/$TESTX/test-index http://www.pligtaflevering.dk/online/vejledning.pdf > x.pdf

Check that the file x.pdf contains an http header followed by a pdf.
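
The check can be scripted rather than done by eye. The following is a minimal sketch, assuming the record starts with an HTTP status line and that the PDF body carries the usual "%PDF-" signature. The helper name check_record is hypothetical, and the check does not validate the PDF itself:

```shell
# Hypothetical helper: succeeds if the file starts with an HTTP status line
# and contains a PDF signature.
check_record() {
  file=$1
  # The file must exist and be non-empty.
  [ -s "$file" ] || return 1
  # The first line of an HTTP response header starts with "HTTP/".
  head -n 1 "$file" | grep -q '^HTTP/' || return 1
  # A PDF body begins with the "%PDF-" signature; LC_ALL=C avoids
  # locale-dependent handling of the binary data.
  LC_ALL=C grep -q '%PDF-' "$file"
}
```

For example, check_record x.pdf && echo "record looks OK" reports success only when both parts are present.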

Clean Up the ArcRepository

On test@kb-prod-udv-001.kb.dk:

cleanup_all_test.sh

Install a Pure Harvester

On kb-prod-udv-001.kb.dk:

prepare_test.sh deploy_config_harvester.xml
install_test.sh
start_test.sh

Check that the web GUI shows sections for Harvest Definition, History, and Status and that the Status section shows no obvious errors.

Create a Pseudo-Cache for Job 1

(Why?)

On netarkiv@sb-test-har-001:

export TESTX=TEST9
scp test@kb-prod-udv-001.kb.dk:test-data/1-cache.tar.bz2 /tmp
cd $HOME/$TESTX/
mkdir -p cache
cd cache
mkdir -p TrivialJobIndexCache
cd TrivialJobIndexCache

# extract 1-cache.tar.bz2 into dummy-cache
mkdir -p dummy-cache
cd dummy-cache
tar -xjf /tmp/1-cache.tar.bz2
cd $HOME/$TESTX/cache/TrivialJobIndexCache

# make symbolic links to the dummy index,
# one per job ID, so that more than one job can be run
export ID="empty 1 2 3 4 "

for num in $ID
do
  ln -vs dummy-cache $num-DEDUP_CRAWL_LOG-cache
done
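
Because ln -s creates a link even when its target is missing, it may be worth confirming that the links resolve. A minimal sketch, assuming the directory layout created above; the function name check_cache_links is hypothetical:

```shell
# Hypothetical check: every <id>-DEDUP_CRAWL_LOG-cache entry in the given
# directory must be a symbolic link that resolves to an existing directory.
check_cache_links() {
  dir=$1
  shift
  for num in "$@"; do
    link="$dir/$num-DEDUP_CRAWL_LOG-cache"
    # -L tests for a symlink; -d follows the link and tests its target.
    if [ ! -L "$link" ] || [ ! -d "$link" ]; then
      echo "broken or missing link: $link" >&2
      return 1
    fi
  done
}
```

It could be called as check_cache_links $HOME/$TESTX/cache/TrivialJobIndexCache $ID, leaving $ID unquoted so that it splits into the individual job IDs.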

Perform a Harvest

Define and perform a harvest of netarkivet.dk with a 5MB limit.

Check the Harvest

There should be at least two WARC files in netarkdv@sb-test-har-001.statsbiblioteket.dk:/home/netarkiv/$TESTX/localarchive .
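
The file count can be checked with a short script on the harvester machine. This is a sketch only: it assumes the files end in .warc or .warc.gz and sit directly in the localarchive directory, and the function names are hypothetical:

```shell
# Hypothetical helpers for counting WARC files in a directory.
count_warcs() {
  # Count regular files directly in $1 whose names end in .warc or .warc.gz.
  find "$1" -maxdepth 1 -type f \( -name '*.warc' -o -name '*.warc.gz' \) | wc -l | tr -d ' '
}

has_enough_warcs() {
  # Succeeds when at least two WARC files are present.
  [ "$(count_warcs "$1")" -ge 2 ]
}
```

For example, has_enough_warcs $HOME/$TESTX/localarchive || echo "too few WARC files" flags a harvest that produced fewer than two files.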

Shutdown the Test

On kb-prod-udv-001:

cleanup_all_test.sh

Start a Pure Viewerproxy

On kb-prod-udv-001:

prepare_test.sh deploy_config_viewerproxy.xml
install_test.sh
start_test.sh

Ignore errors during the database-setup process. The pure ViewerProxy doesn't use or need a database.

Add Some Data to the Viewerproxy

On kb-test-acs-001:

export TESTX=TEST9
mkdir -p $HOME/$TESTX/cache/TrivialJobIndexCache/1-FULL_CRAWL_LOG-cache/
scp kb-prod-udv-001.kb.dk:test-data/1-cache.tar.bz2 $HOME/$TESTX/cache/TrivialJobIndexCache/1-FULL_CRAWL_LOG-cache/
cd $HOME/$TESTX/cache/TrivialJobIndexCache/1-FULL_CRAWL_LOG-cache/
tar -xjf 1-cache.tar.bz2
cd $HOME/$TESTX
mkdir -p localarchive/
cd localarchive/
scp kb-prod-udv-001.kb.dk:test-data/1-1-* .

Test that the Viewerproxy Works

On kb-prod-udv-001:

wget --execute "http_proxy = kb-test-acs-001.kb.dk:$PORT" "http://netarchivesuite.viewerproxy.invalid/changeIndex?jobID=1&label=dummy&returnURL=http://localhost"

(Ignore the 404 error)

wget --execute "http_proxy = kb-test-acs-001.kb.dk:$PORT" "http://www.netarkivet.dk"

Check that this downloads a file which shows the front page of netarkivet.dk.
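
A rough first check of the downloaded file can be automated before opening it in a browser. The sketch below only confirms that the file is non-empty and contains an opening html tag; it cannot tell whether the page really is the netarkivet.dk front page. The function name looks_like_html is hypothetical, and wget is assumed to have saved the page under its default name index.html:

```shell
# Hypothetical helper: succeeds if the file is non-empty and contains an
# opening <html tag (matched case-insensitively).
looks_like_html() {
  [ -s "$1" ] && grep -qi '<html' "$1"
}
```

For example, looks_like_html index.html || echo "download does not look like an HTML page".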

Close Down the Test

cleanup_all_test.sh