dk.netarkivet.archive.arcrepository.bitpreservation
Class ReplicaCacheDatabase

java.lang.Object
  extended by dk.netarkivet.archive.arcrepository.bitpreservation.ReplicaCacheDatabase
All Implemented Interfaces:
BitPreservationDAO, CleanupIF

public class ReplicaCacheDatabase
extends java.lang.Object
implements BitPreservationDAO

Class for storing the bitpreservation cache in a database. This class uses the 'admin.data' file for retrieving the upload status.


Field Summary
protected static org.apache.commons.logging.Log log
          The log.
 
Method Summary
 void addChecksumInformation(java.util.List<ChecksumEntry> checksumOutput, Replica replica)
          Given the output of a checksum job, add the results to the database.
 void addFileListInformation(java.util.List<java.lang.String> filelist, Replica replica)
          Method for adding the results from a list of filenames on a replica.
 void cleanup()
          Method for cleaning up.
 Replica getBitarchiveWithGoodFile(java.lang.String filename)
          Method for finding a replica with a valid version of a file.
 Replica getBitarchiveWithGoodFile(java.lang.String filename, Replica badReplica)
          Method for finding a replica with a valid version of a file.
 java.sql.Date getDateOfLastMissingFilesUpdate(Replica replica)
          Get the date for the last file list job.
 java.sql.Date getDateOfLastWrongFilesUpdate(Replica replica)
          Method for retrieving the date for the last update for corrupted files.
static ReplicaCacheDatabase getInstance()
          Method for retrieving the current instance of this class.
 java.lang.Iterable<java.lang.String> getMissingFilesInLastUpdate(Replica replica)
          Method for retrieving the list of the names of the files which were missing for the replica in the last filelist update.
 long getNumberOfFiles(Replica replica)
          Method for retrieving the number of files within a replica.
 long getNumberOfMissingFilesInLastUpdate(Replica replica)
          Method for retrieving the number of files missing from a specific replica.
 long getNumberOfWrongFilesInLastUpdate(Replica replica)
          Method for retrieving the number of files with an incorrect checksum within a replica.
 java.lang.Iterable<java.lang.String> getWrongFilesInLastUpdate(Replica replica)
          Method for retrieving the list of the files in the replica which have an incorrect checksum.
protected  void initialiseDB()
          Method for initialising the database.
 void print()
          Method to print all the tables in the database.
 FileListStatus retrieveFileListStatus(java.lang.String filename, Replica replica)
          Method for retrieving the filelist_status for a replicafileinfo entry.
 void updateChecksumStatus()
          This method is used to update the status for the checksums for all replicafileinfo entries.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

log

protected static final org.apache.commons.logging.Log log
The log.

Method Detail

getInstance

public static ReplicaCacheDatabase getInstance()
Method for retrieving the current instance of this class.

Returns:
The current instance.

initialiseDB

protected void initialiseDB()
Method for initialising the database. This basically makes sure that all the replicas are within the database, and that no unknown replicas have been defined.


retrieveFileListStatus

public FileListStatus retrieveFileListStatus(java.lang.String filename,
                                             Replica replica)
                                      throws ArgumentNotValid
Method for retrieving the filelist_status for a replicafileinfo entry.

Parameters:
filename - The name of the file.
replica - The replica where the file should be.
Returns:
The filelist_status for the file in the replica.
Throws:
ArgumentNotValid - If the replica is null or the filename is either null or the empty string.

updateChecksumStatus

public void updateChecksumStatus()
This method is used to update the status for the checksums for all replicafileinfo entries.

For each file in the database, the checksum vote is made in the following way.
Each entry in the replicafileinfo table containing the file is retrieved. All the unique checksums are retrieved, i.e. if a checksum is found more than once, the duplicates are ignored.
If only one unique checksum is found, then it must be the correct one, and all the replicas with this file will have their checksum_status set to 'OK'.
If more than one checksum is found, then a vote for the correct checksum is performed. This is done by counting the number of times each unique checksum is found among the replicafileinfo entries for the current file. The checksum with the most votes is chosen as the correct one, and the checksum_status for all the replicafileinfo entries with this checksum is set to 'OK', whereas the replicafileinfo entries with a different checksum are set to 'CORRUPT'.
If no winner is found, then a warning and a notification are issued, and the checksum_status for all the replicafileinfo entries for the current file is set to 'UNKNOWN'.

Specified by:
updateChecksumStatus in interface BitPreservationDAO
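
The voting rules described for updateChecksumStatus can be illustrated with a minimal, self-contained sketch. It is not the NetarchiveSuite implementation: the class ChecksumVoteSketch, its Status enum and the vote method are hypothetical names, the database access is replaced by an in-memory list, and checksums are assumed to be non-null strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the checksum vote; not the actual implementation. */
final class ChecksumVoteSketch {
    /** Mirrors the checksum_status values named in the description. */
    enum Status { OK, CORRUPT, UNKNOWN }

    /**
     * Votes on the checksums reported for a single file.
     * @param checksums one non-null checksum per replicafileinfo entry of the file.
     * @return the resulting status for each entry, in the same order.
     */
    static List<Status> vote(List<String> checksums) {
        // Count how many entries report each distinct checksum.
        Map<String, Integer> votes = new HashMap<>();
        for (String checksum : checksums) {
            votes.merge(checksum, 1, Integer::sum);
        }
        // Find the highest count and detect whether several checksums share it.
        String winner = null;
        int best = 0;
        boolean tie = false;
        for (Map.Entry<String, Integer> e : votes.entrySet()) {
            if (e.getValue() > best) {
                best = e.getValue();
                winner = e.getKey();
                tie = false;
            } else if (e.getValue() == best) {
                tie = true;
            }
        }
        // A single unique checksum trivially wins; a tie leaves every entry UNKNOWN.
        List<Status> result = new ArrayList<>();
        for (String checksum : checksums) {
            if (tie) {
                result.add(Status.UNKNOWN);
            } else {
                result.add(checksum.equals(winner) ? Status.OK : Status.CORRUPT);
            }
        }
        return result;
    }
}
```

With three entries reporting "a", "a" and "b", the sketch yields [OK, OK, CORRUPT]; with two entries reporting "a" and "b" there is no winner, so both become UNKNOWN. The warning and notification mentioned in the description are left out of the sketch.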

addChecksumInformation

public void addChecksumInformation(java.util.List<ChecksumEntry> checksumOutput,
                                   Replica replica)
Given the output of a checksum job, add the results to the database. The following fields in the table are updated for each corresponding entry in the replicafileinfo table:
- checksum = the given checksum.
- filelist_status = ok.
- filelist_checkdatetime = now.
- checksum_checkdatetime = now.

Specified by:
addChecksumInformation in interface BitPreservationDAO
Parameters:
checksumOutput - The output of a checksum job.
replica - The replica this checksum job is for.

addFileListInformation

public void addFileListInformation(java.util.List<java.lang.String> filelist,
                                   Replica replica)
                            throws ArgumentNotValid
Method for adding the results from a list of filenames on a replica. This list of filenames should be the list of all the files within the database. For each file in the FileListJob the following fields are set for the corresponding entry in the replicafileinfo table:
- filelist_status = ok.
- filelist_checkdatetime = now.
For each entry in the replicafileinfo table for the replica which is missing in the results from the FileListJob the following fields are assigned the following values:
- filelist_status = missing.
- filelist_checkdatetime = now.

Specified by:
addFileListInformation in interface BitPreservationDAO
Parameters:
filelist - The list of filenames either parsed from a FilelistJob or the result from a GetAllFilenamesMessage.
replica - The replica, which the FilelistBatchjob has run upon.
Throws:
ArgumentNotValid - If the filelist or the replica is null.
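
The two update rules described for addFileListInformation amount to a set comparison, which the following sketch shows in memory. All names here (FileListUpdateSketch, ListStatus, update) are hypothetical and stand in for the database updates; the filelist_checkdatetime field is omitted, and treating a listed file without an existing entry as a new 'ok' entry is an assumption, not something the description states.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical illustration of the file list update; not the actual implementation. */
final class FileListUpdateSketch {
    /** Stand-in for the two filelist_status values used in the description. */
    enum ListStatus { OK, MISSING }

    /**
     * Computes the filelist_status for every file of one replica.
     * @param knownFiles filenames that already have an entry for the replica.
     * @param filelist filenames reported by the file list job.
     * @return a sorted map from filename to its new status.
     */
    static Map<String, ListStatus> update(Set<String> knownFiles, List<String> filelist) {
        Map<String, ListStatus> result = new TreeMap<>();
        // Start by assuming every known entry is missing ...
        for (String name : knownFiles) {
            result.put(name, ListStatus.MISSING);
        }
        // ... then mark every reported file as OK (assumption: unknown files are added).
        for (String name : filelist) {
            result.put(name, ListStatus.OK);
        }
        return result;
    }
}
```

For known entries a.arc and b.arc and a reported list of b.arc and c.arc, the sketch returns {a.arc=MISSING, b.arc=OK, c.arc=OK}.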

getDateOfLastMissingFilesUpdate

public java.sql.Date getDateOfLastMissingFilesUpdate(Replica replica)
                                              throws ArgumentNotValid
Get the date for the last file list job.

Specified by:
getDateOfLastMissingFilesUpdate in interface BitPreservationDAO
Parameters:
replica - The replica to get the date for.
Returns:
The date of the last missing files update for the replica. A null is returned if no last missing files update has been performed.
Throws:
ArgumentNotValid - If the replica is null.

getDateOfLastWrongFilesUpdate

public java.sql.Date getDateOfLastWrongFilesUpdate(Replica replica)
                                            throws ArgumentNotValid
Method for retrieving the date for the last update for corrupted files. This method does not contact the replicas; it only retrieves the data from the last time the checksum was retrieved.

Specified by:
getDateOfLastWrongFilesUpdate in interface BitPreservationDAO
Parameters:
replica - The replica to find the date for the latest update for corruption of files.
Returns:
The date for the last checksum update. A null is returned if no wrong files update has been performed for this replica.
Throws:
ArgumentNotValid - If the replica is null.

getNumberOfMissingFilesInLastUpdate

public long getNumberOfMissingFilesInLastUpdate(Replica replica)
                                         throws ArgumentNotValid
Method for retrieving the number of files missing from a specific replica. This method does not contact the replica directly; it only retrieves the count of missing files from the last filelist update.

Specified by:
getNumberOfMissingFilesInLastUpdate in interface BitPreservationDAO
Parameters:
replica - The replica to find the number of missing files for.
Returns:
The number of missing files for the replica.
Throws:
ArgumentNotValid - If the replica is null.

getMissingFilesInLastUpdate

public java.lang.Iterable<java.lang.String> getMissingFilesInLastUpdate(Replica replica)
                                                                 throws ArgumentNotValid
Method for retrieving the list of the names of the files which were missing for the replica in the last filelist update. This method does not contact the replica; it only uses the database to find the files which were missing during the last filelist update.

Specified by:
getMissingFilesInLastUpdate in interface BitPreservationDAO
Parameters:
replica - The replica to find the list of missing files for.
Returns:
A list containing the names of the files which are missing in the given replica.
Throws:
ArgumentNotValid - If the replica is null.

getNumberOfWrongFilesInLastUpdate

public long getNumberOfWrongFilesInLastUpdate(Replica replica)
                                       throws ArgumentNotValid
Method for retrieving the number of files with an incorrect checksum within a replica. This method does not contact the replica; it only uses the database to count the number of files which are corrupt.

Specified by:
getNumberOfWrongFilesInLastUpdate in interface BitPreservationDAO
Parameters:
replica - The replica to find the number of corrupted files for.
Returns:
The number of corrupted files.
Throws:
ArgumentNotValid - If the replica is null.

getWrongFilesInLastUpdate

public java.lang.Iterable<java.lang.String> getWrongFilesInLastUpdate(Replica replica)
                                                               throws ArgumentNotValid
Method for retrieving the list of the files in the replica which have an incorrect checksum, i.e. whose checksum_status is set to CORRUPT. This method does not contact the replica; it only uses the local database.

Specified by:
getWrongFilesInLastUpdate in interface BitPreservationDAO
Parameters:
replica - The replica to find the list of corrupted files for.
Returns:
The list of files which have wrong checksums.
Throws:
ArgumentNotValid - If the replica is null.

getNumberOfFiles

public long getNumberOfFiles(Replica replica)
                      throws ArgumentNotValid
Method for retrieving the number of files within a replica. This counts all the files which are not missing from the replica, i.e. all entries in the replicafileinfo table which have the filelist_status set to OK. Whether the files have a correct checksum is ignored. This method does not contact the replica; it only uses the local database.

Specified by:
getNumberOfFiles in interface BitPreservationDAO
Parameters:
replica - The replica to count the number of files for.
Returns:
The number of files within the replica.
Throws:
ArgumentNotValid - If the replica is null.

getBitarchiveWithGoodFile

public Replica getBitarchiveWithGoodFile(java.lang.String filename)
                                  throws ArgumentNotValid
Method for finding a replica with a valid version of a file. This method is used in order to find a replica from which a file should be retrieved, during the process of restoring a corrupt file on another replica. This replica must be of the type bitarchive, since a file cannot be retrieved from a checksum replica.

Specified by:
getBitarchiveWithGoodFile in interface BitPreservationDAO
Parameters:
filename - The name of the file which needs to have a valid version in a bitarchive.
Returns:
A bitarchive which contains a valid version of the file, or null if no such bitarchive exists.
Throws:
ArgumentNotValid - If the filename is null or the empty string.

getBitarchiveWithGoodFile

public Replica getBitarchiveWithGoodFile(java.lang.String filename,
                                         Replica badReplica)
                                  throws ArgumentNotValid
Method for finding a replica with a valid version of a file. This method is used in order to find a replica from which a file should be retrieved, during the process of restoring a corrupt file on another replica. This replica must be of the type bitarchive, since a file cannot be retrieved from a checksum replica.

Specified by:
getBitarchiveWithGoodFile in interface BitPreservationDAO
Parameters:
filename - The name of the file which needs to have a valid version in a bitarchive.
badReplica - The replica which has a bad copy of the given file.
Returns:
A bitarchive which contains a valid version of the file, or null if no such bitarchive exists.
Throws:
ArgumentNotValid - If the replica is null or the filename is either null or the empty string.
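
The selection rule behind both getBitarchiveWithGoodFile variants can be sketched as a filter over the entries for one file. This is only an illustration under stated assumptions: the names GoodBitarchiveSketch, Entry, ReplicaType and findGood are hypothetical, replicas are represented by plain id strings, and "valid" is assumed to mean that the entry's checksum_status is OK.

```java
import java.util.List;

/** Hypothetical illustration of picking a good bitarchive; not the actual implementation. */
final class GoodBitarchiveSketch {
    /** The two replica types mentioned in the description. */
    enum ReplicaType { BITARCHIVE, CHECKSUM }

    /** Simplified stand-in for one replicafileinfo entry of a file. */
    static final class Entry {
        final String replicaId;
        final ReplicaType type;
        final boolean checksumOk; // assumption: true means checksum_status is OK

        Entry(String replicaId, ReplicaType type, boolean checksumOk) {
            this.replicaId = replicaId;
            this.type = type;
            this.checksumOk = checksumOk;
        }
    }

    /**
     * Finds a bitarchive replica holding a valid copy of the file.
     * @param entriesForFile the entries of the file across all replicas.
     * @param badReplicaId the replica to skip, or null to skip none.
     * @return the id of a suitable bitarchive, or null if none exists.
     */
    static String findGood(List<Entry> entriesForFile, String badReplicaId) {
        for (Entry e : entriesForFile) {
            // Checksum replicas cannot deliver the file itself, so they are skipped.
            if (e.type == ReplicaType.BITARCHIVE
                    && e.checksumOk
                    && !e.replicaId.equals(badReplicaId)) {
                return e.replicaId;
            }
        }
        return null;
    }
}
```

Passing null as badReplicaId corresponds to the single-argument variant, where no replica is excluded up front.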

print

public void print()
Method to print all the tables in the database. FIXME This is only used during implementation. Kill me afterwards!


cleanup

public void cleanup()
Method for cleaning up.

Specified by:
cleanup in interface BitPreservationDAO
Specified by:
cleanup in interface CleanupIF