Takes a seed list and creates any necessary domains, configurations, and
seedlists to enable them to be harvested with the given template and
other parameters.
An instance of this class is sent by a bitarchive machine (Bitarchive class)
to a BitarchiveMonitorServer to indicate that that single machine has
finished processing a batch job.
Message to signal from a BitarchiveServer to the BitarchiveMonitorServer
that the Bit Archive Application identified by
BA_ApplicationId has completed its part of the batch job.
If we haven't heard from a bit archive within this many milliseconds,
we don't expect it to be online and won't wait for it to reply to a
batch job.
The BitarchiveMonitorServer will listen for BatchEndedMessages for this
many milliseconds before it decides that a batch job is taking too long
and returns just the replies it has received at that point.
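As a rough illustration of how the two timeouts above could interact, here is a minimal sketch. The class `BatchMonitorSketch`, its method names, and the exact boundary conditions (`<=` and `>=`) are assumptions made for this example; they are not taken from the BitarchiveMonitorServer code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not the actual monitor implementation.
final class BatchMonitorSketch {
    private final long aliveTimeoutMillis;   // "haven't heard from a bit archive" limit
    private final long batchTimeoutMillis;   // "batch job is taking too long" limit
    private final Map<String, Long> lastHeardMillis = new HashMap<>();

    BatchMonitorSketch(long aliveTimeoutMillis, long batchTimeoutMillis) {
        this.aliveTimeoutMillis = aliveTimeoutMillis;
        this.batchTimeoutMillis = batchTimeoutMillis;
    }

    // Remember when an application last sent a sign of life.
    void recordHeartbeat(String applicationId, long nowMillis) {
        lastHeardMillis.put(applicationId, nowMillis);
    }

    // Only applications heard from recently are expected to reply to a batch job.
    Set<String> applicationsToWaitFor(long nowMillis) {
        Set<String> alive = new TreeSet<>();
        for (Map.Entry<String, Long> e : lastHeardMillis.entrySet()) {
            if (nowMillis - e.getValue() <= aliveTimeoutMillis) {
                alive.add(e.getKey());
            }
        }
        return alive;
    }

    // Stop when every expected application has replied, or when the batch timeout has passed.
    boolean shouldStopWaiting(long batchStartedMillis, long nowMillis,
                              Set<String> expected, Set<String> replied) {
        return replied.containsAll(expected)
                || nowMillis - batchStartedMillis >= batchTimeoutMillis;
    }
}
```

Passing the clock value in as a parameter keeps the sketch deterministic; a real server would more likely read the current time itself.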
Class to hold the result of a lookup operation in the bitarchive:
the metadata information associated with the record,
the actual byte content, and
the name of the file the data were retrieved from.
If the length of the record exceeds the value of
Settings.BITARCHIVE_LIMIT_FOR_RECORD_DATATRANSFER_IN_FILE,
the record is stored in a RemoteFile.
Compares two entries first by location, then by machine name, then by
port, then by application name, and finally by index.
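A multi-key comparison like this can be written as a chained comparator. The sketch below is only an illustration: the `AppEntry` class, its field names, and the choice of a single integer port per entry are assumptions, not the actual entry type.

```java
import java.util.Comparator;

// Hypothetical entry type; the fields are assumed for illustration only.
final class AppEntry {
    final String location;
    final String machineName;
    final int port;
    final String applicationName;
    final int index;

    AppEntry(String location, String machineName, int port,
             String applicationName, int index) {
        this.location = location;
        this.machineName = machineName;
        this.port = port;
        this.applicationName = applicationName;
        this.index = index;
    }
}

final class AppEntryOrdering {
    // Each later key is consulted only when all earlier keys compare equal.
    static final Comparator<AppEntry> BY_LOCATION_MACHINE_PORT_APP_INDEX =
            Comparator.comparing((AppEntry e) -> e.location)
                    .thenComparing(e -> e.machineName)
                    .thenComparingInt(e -> e.port)
                    .thenComparing(e -> e.applicationName)
                    .thenComparingInt(e -> e.index);
}
```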
This method finds out which files in a given bitarchive are
misrepresented in the admin data: either having the wrong checksum or not
being marked as uploaded when they actually are.
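The check described here amounts to comparing what the bitarchive actually holds against what the admin data claims. The following is a minimal sketch under stated assumptions: `AdminEntry`, `AdminDataCheck`, and the map-based inputs are hypothetical stand-ins, and files that are absent from the admin data altogether are simply skipped.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical view of one admin data record.
final class AdminEntry {
    final String checksum;
    final boolean markedUploaded;

    AdminEntry(String checksum, boolean markedUploaded) {
        this.checksum = checksum;
        this.markedUploaded = markedUploaded;
    }
}

final class AdminDataCheck {
    /**
     * @param archiveChecksums filename -> checksum as reported by the bitarchive
     * @param adminData        filename -> what the admin data records for that file
     * @return names of files present in the bitarchive whose admin record is wrong
     */
    static Set<String> misrepresented(Map<String, String> archiveChecksums,
                                      Map<String, AdminEntry> adminData) {
        Set<String> bad = new TreeSet<>();
        for (Map.Entry<String, String> file : archiveChecksums.entrySet()) {
            AdminEntry admin = adminData.get(file.getKey());
            if (admin == null) {
                continue; // unknown to admin data: outside the scope of this sketch
            }
            boolean wrongChecksum = !admin.checksum.equals(file.getValue());
            boolean notMarkedUploaded = !admin.markedUploaded;
            if (wrongChecksum || notMarkedUploaded) {
                bad.add(file.getKey());
            }
        }
        return bad;
    }
}
```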
Gets a list of all domains in the order expected by snapshot harvest
job generation, that is, ordered by
template name, then byte limit (descending), then domain name.
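To make the ordering concrete, here is an in-memory sketch of the same sort. The `DomainRow` class and its field names are assumptions for illustration; the actual method may well obtain this order from the database rather than sorting in Java.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical row: one domain with the template and byte limit of its configuration.
final class DomainRow {
    final String templateName;
    final long byteLimit;
    final String domainName;

    DomainRow(String templateName, long byteLimit, String domainName) {
        this.templateName = templateName;
        this.byteLimit = byteLimit;
        this.domainName = domainName;
    }
}

final class SnapshotOrder {
    // Template name ascending, byte limit descending, domain name ascending.
    static final Comparator<DomainRow> ORDER =
            Comparator.comparing((DomainRow d) -> d.templateName)
                    .thenComparing(
                            Comparator.comparingLong((DomainRow d) -> d.byteLimit).reversed())
                    .thenComparing(d -> d.domainName);

    // Returns a sorted copy and leaves the input list untouched.
    static List<DomainRow> sorted(List<DomainRow> rows) {
        List<DomainRow> copy = new ArrayList<>(rows);
        copy.sort(ORDER);
        return copy;
    }
}
```

Note that `reversed()` is applied to the byte-limit comparator alone, so only that key is descending.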
Returns the unique instance of this class.
The server creates an instance of the bitarchive it provides access to
and starts to listen to JMS messages on the incoming JMS queue.
Also, heartbeats are sent out at regular intervals to the Bitarchive
Monitor, to tell that this bitarchive is alive.
Returns or creates the unique instance of this singleton.
The server creates an instance of the HarvestController,
uploads arc-files from unfinished harvests, and
starts to listen to JMS messages on the incoming JMS queues.
Given a file in sorted order and a prefix to search for, returns
an iterable that will return the lines in the file that start with
the prefix, in order.
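Because the input is sorted, the matching lines form one contiguous run that can be located with a binary search. The sketch below works on an in-memory list of lines rather than on a file, and assumes the lines are sorted by `String.compareTo`; the class and method names are hypothetical and this is not the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

final class SortedPrefixSearch {
    // Index of the first line that is >= prefix, or the list size if there is none.
    static int lowerBound(List<String> sortedLines, String prefix) {
        int lo = 0;
        int hi = sortedLines.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedLines.get(mid).compareTo(prefix) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Collects the contiguous run of lines starting with the prefix, in sorted order.
    static List<String> linesWithPrefix(List<String> sortedLines, String prefix) {
        List<String> matches = new ArrayList<>();
        for (int i = lowerBound(sortedLines, prefix);
             i < sortedLines.size() && sortedLines.get(i).startsWith(prefix);
             i++) {
            matches.add(sortedLines.get(i));
        }
        return matches;
    }
}
```

A file-based version would apply the same idea with seeks into the file instead of list indexing, which avoids reading the whole file into memory.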
This application controls the Heritrix harvester which does the actual
harvesting, and is also responsible for uploading the harvested data to the
ArcRepository.
Deprecated. Use hasEntry(filename) instead of hasChecksum(filename).
If hasEntry(filename) is true, we have recorded a
checksum for the filename as well.
Processes a checksum request: Either sets the checksum for a given
file ("file" parameter) in the arcrepository (if "fixadminchecksum"
parameter is given) or removes and reuploads a file in one bitarchive
("bitarchive" parameter) checking with the checksum and credentials
given.
Extracts all required parameters from the request, checks for any
inconsistencies, and passes the requisite data to the updateDomain method
for processing.
Extracts information from a servlet request to update seedlists in a
domain.
editUrlList: if not null, we are editing, not updating, so return.
(urlListName, seedlist): the name of a seedlist and the actual seedlist
for a seedlist to be updated.
Extracts all required parameters from the request, checks for
any inconsistencies, and passes the requisite data to the
updateHarvestDefinition method for processing.
Extract the name of the bitarchive (parameter "bitarchive") and whether
to update missing files (parameter "findmissingfiles") or checksums (parameter "checksum").
This class holds information about one section of the site, including
information about what to put in the menu sidebar and how to determine
which page you're in.