We have prepared a bash shell script that starts all the necessary components on one machine. We will use this script throughout this quickstart manual to allow you to get a feel for what the system can do and how it works without having to deal with issues of distributing to other servers.

Base system required

For the quick startup, NetarchiveSuite requires:

  • A Linux system with at least 2 GB of free disk space.

...

To check that you have the right version of Java, do the following:

  • Open a terminal and log in to the Linux system as an ordinary user.
  • Check that the Java version is 1.6.0_19 (or higher) by typing:
    Code Block
    $ java -version
    You should then see something like:
    Code Block
    
    linux>java -version
    java version "1.6.0_19"
    Java(TM) SE Runtime Environment (build 1.6.0_19-b04)
    Java HotSpot(TM) Server VM (build 16.2-b04, mixed mode)
    

Downloading

Download of the newest release is described here

  • Create a directory for the download, e.g. netarchive:
    Code Block
     $ mkdir netarchive 
  • Download the NetarchiveSuite zip file and put it in the netarchive directory you just created.

Note: Instead of downloading NetarchiveSuite.zip, you can build it yourself from the svn trunk:

Code Block
$ svn export https://sbforge.org/svn/netarchivesuite/trunk trunk
$ cd trunk
$ ant releasezipball
$ mv NetarchiveSuite.zip ../NetarchiveSuite.zip

Setup JMS

NetarchiveSuite uses JMS for inter-process communication. JMS is the Java Message Service, which provides asynchronous communication between processes. You do not need any knowledge of JMS to use NetarchiveSuite. However, you must make sure that no other JMS broker is already running on your system on port 7676.
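
One way to check for a conflicting broker is to look at the listening TCP sockets. The sketch below assumes the ss utility (from iproute2) is installed and that the local address is the fourth column of its `ss -ltn` output; the helper name listening_on is made up for this example.

```shell
# Hypothetical helper: reads `ss -ltn` style output on stdin and succeeds
# if any listening socket has a local address ending in :PORT.
listening_on() {
    awk -v port="$1" '
        NR > 1 && $4 ~ (":" port "$") { found = 1 }   # NR > 1 skips the header row
        END { exit !found }
    '
}

# Usage (assumes ss is installed):
#   if ss -ltn | listening_on 7676; then echo "port 7676 is already in use"; fi
```

If the check succeeds, stop the other broker (or whatever process owns the port) before continuing.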

Currently only the open-source version of Sun's JMS implementation is supported, since some functionality of other implementations does not match our assumptions well.

To download and install it, do the following:

  • Open http://mq.dev.java.net/downloads.html in a browser window.
  • Click the Linux link under the version 4.4 Binary Downloads to download the file openmq4_4-installer-Linux_X86.zip (or a later version).
  • Save the downloaded file to the netarchive directory you created earlier.
  • Go to the directory:
    Code Block
     $ cd ~/netarchive 
  • Unpack the zip file (this creates a directory openmq4_4-installer), and run the X-Windows installer. The installer asks you to choose an install-home (choose netarchive/MessageQueue) and a JDK.
    Code Block
    
    $ unzip openmq4_4-installer-Linux_X86.zip
    $ cd openmq4_4-installer; ./installer
  • Set the necessary environment variables (IMQ_HOME, IMQ_VARHOME, IMQ_ETCHOME):
    Code Block
    $ export IMQ_HOME=$HOME/netarchive/MessageQueue/mq
    $ export IMQ_VARHOME=$HOME/netarchive/MessageQueue/mq/var
    $ export IMQ_ETCHOME=$HOME/netarchive/MessageQueue/mq/etc
    
  • Run imqbrokerd to create the settings file:
    Code Block
    $ chmod +x ./MessageQueue/mq/bin/imqbrokerd
    $ ./MessageQueue/mq/bin/imqbrokerd 
  • Check that imqbrokerd starts and that the last message is:
    Code Block
     "Broker <localhost>:7676 ready" 
  • Stop imqbrokerd by pressing Control-C.
  • Edit the settings file to allow enough listeners on a queue:
    Code Block
    ~/netarchive/MessageQueue/var/mq/instances/imqbroker/props/config.properties
  • Uncomment the listener setting and set it to 20 by changing the line
    Code Block
    #            imq.autocreate.queue.maxNumActiveConsumers
    to
    Code Block
                imq.autocreate.queue.maxNumActiveConsumers=20
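
The manual edit above can also be scripted. The following is a minimal sketch, not part of the NetarchiveSuite distribution: the function name set_max_consumers is hypothetical, and it assumes the property appears on a line of its own, either commented out or already set.

```shell
# Hypothetical helper: uncomment imq.autocreate.queue.maxNumActiveConsumers
# in a broker config file and set it to the given (numeric) value.
set_max_consumers() {
    local file="$1" count="$2" tmp
    tmp="$(mktemp)" || return 1
    # Match the property whether or not it is commented out, and rewrite the whole line.
    if sed -E "s/^[[:space:]]*#?[[:space:]]*imq\.autocreate\.queue\.maxNumActiveConsumers([[:space:]]*=.*)?[[:space:]]*\$/imq.autocreate.queue.maxNumActiveConsumers=${count}/" \
        "$file" > "$tmp"; then
        cat "$tmp" > "$file"   # overwrite in place, keeping the file's permissions
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Usage (path as given in the step above):
#   set_max_consumers ~/netarchive/MessageQueue/var/mq/instances/imqbroker/props/config.properties 20
```

Run it only while the broker is stopped, and consider keeping a copy of the original file.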

To start it, do the following:

Code Block

$ cd netarchive
$ ./MessageQueue/mq/bin/imqbrokerd &

Installation

Download the following files to the netarchive directory:

RunNetarchiveSuite.sh

deploy_standalone_example.xml

  • The QuickStart guide was tested with Ubuntu.
  • Sun/Oracle Java 8 SE runtime (or higher). Other Java versions such as OpenJDK are neither tested nor recommended.
  • An additional user named "test". The commands to install NetarchiveSuite are run from your own login, but they install and run the NetarchiveSuite software as the user "test". This simulates the more realistic production situation where the software runs under various logins on one or more machines in a distributed network. For convenience, configure the test user for password-free ssh access, i.e. you should be able to execute "ssh test@localhost" in a shell without entering the test user's password.
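
The Java requirement above can be checked from a script. The sketch below only compares the major version; it does not tell Sun/Oracle Java apart from OpenJDK. The helper name java_major is hypothetical, and it assumes version strings follow either the legacy "1.N.x" scheme (up to Java 8) or the "N.x" scheme (Java 9 and later).

```shell
# Hypothetical helper: print the major version for a Java version string.
java_major() {
    local v="$1"
    case "$v" in
        1.*) v="${v#1.}" ;;          # "1.8.0_191" -> "8.0_191"
    esac
    printf '%s\n' "${v%%[._-]*}"     # keep only the leading number
}

# Usage (java -version writes to stderr, with the version in double quotes):
#   v="$(java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }')"
#   [ "$(java_major "$v")" -ge 8 ] || echo "Java 8 or higher is required"
```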

Setup JMS

NetarchiveSuite uses the Java Message Service (JMS) for communication between its components. The implementation used is OpenMQ 5.1.

Download the openmq installation script:

Code Block
wget https://raw.githubusercontent.com/netarchivesuite/netarchivesuite/master/deploy/deploy-core/scripts/openmq/mq.sh
chmod +x mq.sh

Install the openmq broker:

Code Block
./mq.sh install

This downloads OpenMQ, installs it, and starts it.

Note: OpenMQ will be installed to ~/MessageQueue5.1.

Download NetarchiveSuite

Binary releases are available from https://sbforge.org/downloads/netarchivesuite/releases/.


Create a working directory and navigate to it

Code Block
mkdir netarchive; cd netarchive


Download the latest release

Code Block
wget -N -O NetarchiveSuite.zip https://sbforge.org/downloads/netarchivesuite/releases/stable/5.2/NetarchiveSuite-5.2.zip

together with the latest bundled harvester

Code Block
wget -N -O NetarchiveSuite-heritrix3-bundler.zip https://sbforge.org/downloads/netarchivesuite/releases/stable/5.2/NetarchiveSuite-heritrix3-bundler-5.2.zip		

Download RunNetarchiveSuite.sh and deploy_standalone_example.xml to the netarchive directory:

Code Block
wget -N https://raw.githubusercontent.com/netarchivesuite/netarchivesuite/master/deploy/distribution/src/main/resources/examples/deploy_standalone_example.xml
wget -N https://raw.githubusercontent.com/netarchivesuite/netarchivesuite/master/deploy/deploy-core/scripts/RunNetarchiveSuite.sh
chmod +x RunNetarchiveSuite.sh

RunNetarchiveSuite.sh is a simple script that performs all the deployment steps. It takes a NetarchiveSuite package ('.zip'), a configuration file (deploy_standalone_example.xml), a temporary installation directory, and the heritrix3 bundler zip file as arguments (in the given order). The ports used by the applications for communication are defined in the deploy_standalone_example.xml file.

In the configuration file, all the applications are placed on one machine, the current machine (localhost).

When the installation script is run, it unpacks the installation files into the netarchive/deploy directory as the current user and then, as the user "test", installs NetarchiveSuite into the /home/test/QUICKSTART directory (using ssh).

Remember that Sun/Oracle Java 8 is required for the Quickstart procedure.

If you already have a Quickstart installation, the existing bitarchive, database and admin.data files will be left untouched. You must explicitly remove any previous installation if you want a clean, empty installation.

To do the deployment:

Code Block
./RunNetarchiveSuite.sh NetarchiveSuite.zip deploy_standalone_example.xml deploy NetarchiveSuite-heritrix3-bundler.zip
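
Before running the deployment, it can help to confirm that every input file is present in the working directory. This is a small sketch; the function name check_deploy_inputs is hypothetical, and the file names in the usage comment are the ones produced by the wget commands earlier in this guide.

```shell
# Hypothetical helper: report any argument that is not an existing regular file.
# Returns 0 if all files exist, 1 otherwise.
check_deploy_inputs() {
    local missing=0 f
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_deploy_inputs RunNetarchiveSuite.sh NetarchiveSuite.zip \
#       deploy_standalone_example.xml NetarchiveSuite-heritrix3-bundler.zip \
#       && echo "all deployment inputs present"
```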

Note that if you have not set up key-based ssh login for the test user, you will need to enter the test user's password multiple times before the installation finishes successfully.
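
To find out in advance whether password-free login works, ssh can be asked to fail rather than prompt. The sketch below relies on the standard OpenSSH options BatchMode and ConnectTimeout; the function name and the SSH override variable are hypothetical and exist only to make the helper easy to try out.

```shell
# Hypothetical helper: succeed only if ssh can log in to TARGET without prompting.
# BatchMode=yes makes ssh fail instead of asking for a password.
can_login_without_password() {
    local target="$1"
    "${SSH:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$target" true
}

# Usage:
#   can_login_without_password test@localhost || echo "key-based login is not set up"
```

A common way to set up key-based login is to run ssh-keygen (if you have no key yet) followed by ssh-copy-id test@localhost.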


The script creates a deployment folder named "QUICKSTART" under the test user's home directory, which contains scripts for starting and stopping NetarchiveSuite, and then starts the whole NetarchiveSuite. The files used to run the installation are placed in the directory deploy under the directory where you ran the RunNetarchiveSuite.sh command.

Now configure your browser:

  • Start a web browser.
  • Set up the browser to use a proxy on port 8070 on the host running NetarchiveSuite, and exclude localhost and the hostname (used by the Heritrix GUI). In Firefox it is done as follows:
Code Block
Choose in the Firefox toolbar:
Edit->Preferences->Advanced->Network->Settings
Checkmark:
Manual Proxy Configuration
and add:
Http proxy: name-of-host
Port: 8070

No Proxy for: localhost, name-of-host
  • Enter the following URL in the browser: http://localhost:8074/HarvestDefinition (if running on the local machine, otherwise go to port 8074 on the host running NetarchiveSuite).

    You should see the Netarchive harvest definition page.
    If you see a stack trace instead of the Netarchive pages, this is most likely because the browser is trying to go through the ViewerProxy. Revisit the browser configuration and try again.
  • You can now see the web interface in the browser. You can create jobs, run them, and browse output by following the basic instructions in the rest of this manual, or the more complete description in the User Manual.
  • You can stop and start the entire NAS system with:
Code Block
ssh test@localhost
cd QUICKSTART
./conf/killall.sh
./conf/startall.sh
  • If you want to try other deploy examples, then go to "Examples of deploy configuration files" in the Installation Manual.
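
After starting the system, the web interface may take a little while to respond (an assumption; the startup time is not stated in this guide). A generic retry helper like the sketch below can wait for it. The name wait_for and the WAIT_INTERVAL variable are hypothetical; the curl options used in the usage comment (-s, -f, -o) are standard.

```shell
# Hypothetical helper: run a command up to TRIES times until it succeeds.
# Sleeps WAIT_INTERVAL seconds (default 2) between attempts.
wait_for() {
    local tries="$1" i=0
    shift
    while [ "$i" -lt "$tries" ]; do
        "$@" && return 0
        i=$((i + 1))
        sleep "${WAIT_INTERVAL:-2}"
    done
    return 1
}

# Usage (assumes curl is installed and the GUI runs on the local machine):
#   wait_for 30 curl -sf -o /dev/null http://localhost:8074/HarvestDefinition \
#       && echo "web interface is up"
```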
