dk.netarkivet.archive.arcrepository.bitpreservation
Class DatabaseBasedActiveBitPreservation

java.lang.Object
  extended by dk.netarkivet.archive.arcrepository.bitpreservation.DatabaseBasedActiveBitPreservation
All Implemented Interfaces:
ActiveBitPreservation, CleanupIF

public class DatabaseBasedActiveBitPreservation
extends java.lang.Object
implements ActiveBitPreservation, CleanupIF


Field Summary
protected static org.apache.commons.logging.Log log
          The log.
 
Method Summary
 void addMissingFilesToAdminData(java.lang.String... filenames)
          Add files unknown in admin.data to admin.data.
 void changeStateForAdminData(java.lang.String filename)
          Reestablish admin data to match bitarchive states for file.
 void cleanup()
          Used to clean up a class from within a shutdown hook.
 void close()
          Method for closing the running instance of this class.
 void findChangedFiles(Replica replica)
          The method is used to update the checksum for all the files in a replica.
 void findMissingFiles(Replica replica)
          This method retrieves the filelist for the replica, and then it updates the database with this list of filenames.
 java.lang.Iterable<java.lang.String> getChangedFiles(Replica replica)
          This method retrieves the names of all the files which have a wrong checksum for the replica.
 java.lang.Iterable<java.lang.String> getChangedFilesForAdminData()
          Return a list of files with wrong checksum or state in admin data.
 java.sql.Date getDateForChangedFiles(Replica replica)
          This method retrieves the date when the latest checksum update was performed for the replica.
 java.sql.Date getDateForMissingFiles(Replica replica)
          This method retrieves the date when the latest filelist update was performed for the replica.
 FilePreservationState getFilePreservationState(java.lang.String filename)
          Get the details of the state of the given file in the bitarchives and admin data.
 java.util.Map<java.lang.String,FilePreservationState> getFilePreservationStateMap(java.lang.String... filenames)
          Get details of the state of one or more files in the bitarchives and admin data.
static DatabaseBasedActiveBitPreservation getInstance()
          Method for retrieving the current instance of this class.
 java.lang.Iterable<java.lang.String> getMissingFiles(Replica replica)
          This method retrieves the names of all the files which are missing for the given replica.
 java.lang.Iterable<java.lang.String> getMissingFilesForAdminData()
          Return a list of files represented in replica but missing in AdminData.
 long getNumberOfChangedFiles(Replica replica)
          The method calculates the number of files which have a wrong checksum for the replica.
 long getNumberOfFiles(Replica replica)
          This method finds the number of files which are known to be in the archive of a specific replica.
 long getNumberOfMissingFiles(Replica replica)
          This method calculates the number of files which are not found in the given replica.
 void rebuildDatabase()
          This is a method to recreate the database, if it somehow has been lost.
 void replaceChangedFile(Replica replica, java.lang.String filename, java.lang.String credentials, java.lang.String checksum)
          Check that the checksum of the file is indeed different to the value in admin data and reference replica.
 void uploadMissingFiles(Replica replica, java.lang.String... filenames)
          This method is used to upload missing files to a replica.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

log

protected static final org.apache.commons.logging.Log log
The log.

Method Detail

getInstance

public static DatabaseBasedActiveBitPreservation getInstance()
Method for retrieving the current instance of this class.

Returns:
The instance.

rebuildDatabase

public void rebuildDatabase()
This is a method to recreate the database, if it somehow has been lost. It calls all the replicas and retrieves the checksums from each replica as ChecksumEntry objects, each of which contains both a checksum and a filename. Thus the names of all the files within any replica should be retrieved, along with at least one checksum per file. These checksum results are then put into the database.


getNumberOfChangedFiles

public long getNumberOfChangedFiles(Replica replica)
The method calculates the number of files which have a wrong checksum for the replica. This simply counts all the entries in the replicafileinfo table for the replica where the filelist_status is set to CORRUPT.

Specified by:
getNumberOfChangedFiles in interface ActiveBitPreservation
Parameters:
replica - The replica for which to count the number of changed files.
Returns:
The number of files for the replica where the checksum does not correspond to the checksum of the same file in the other replicas.

getChangedFiles

public java.lang.Iterable<java.lang.String> getChangedFiles(Replica replica)
This method retrieves the names of all the files which have a wrong checksum for the replica. It simply returns the filenames of all the entries in the replicafileinfo table for the replica where the filelist_status is set to CORRUPT.

Specified by:
getChangedFiles in interface ActiveBitPreservation
Parameters:
replica - The replica for which the changed files should be found.
Returns:
The names of files in the replica where the checksum does not correspond to the checksum of the same file in the other replicas.

getNumberOfMissingFiles

public long getNumberOfMissingFiles(Replica replica)
This method calculates the number of files which are not found in the given replica. This simply counts all the entries in the replicafileinfo table for the replica where the filelist_status is set to MISSING.

Specified by:
getNumberOfMissingFiles in interface ActiveBitPreservation
Parameters:
replica - The replica for which to count the number of missing files.
Returns:
The number of files which are missing in the replica.

getMissingFiles

public java.lang.Iterable<java.lang.String> getMissingFiles(Replica replica)
This method retrieves the names of all the files which are missing for the given replica. It simply returns the filenames of all the entries in the replicafileinfo table for the replica where the filelist_status is set to MISSING.

Specified by:
getMissingFiles in interface ActiveBitPreservation
Parameters:
replica - The replica for which the missing files should be found.
Returns:
The names of files in the replica which are missing.
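
The four query methods above are each described as a filter over replicafileinfo entries by replica and status. The following is a minimal in-memory sketch of that filtering, not the database code used by this class. FileStatus, FileInfoRow, ReplicaFileInfoSketch and the replica ids "ONE" and "TWO" are hypothetical stand-ins, and the single status field is a simplification of the table's columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the status recorded in a replicafileinfo entry. */
enum FileStatus { OK, MISSING, CORRUPT }

/** Hypothetical stand-in for one replicafileinfo entry. */
final class FileInfoRow {
    final String replicaId;
    final String filename;
    final FileStatus status;

    FileInfoRow(String replicaId, String filename, FileStatus status) {
        this.replicaId = replicaId;
        this.filename = filename;
        this.status = status;
    }
}

final class ReplicaFileInfoSketch {
    /** Returns the filenames recorded for the replica with the given status. */
    static List<String> filesWithStatus(List<FileInfoRow> rows, String replicaId,
                                        FileStatus status) {
        List<String> result = new ArrayList<>();
        for (FileInfoRow row : rows) {
            if (row.replicaId.equals(replicaId) && row.status == status) {
                result.add(row.filename);
            }
        }
        return result;
    }

    /** Counts the entries recorded for the replica with the given status. */
    static long countWithStatus(List<FileInfoRow> rows, String replicaId,
                                FileStatus status) {
        return filesWithStatus(rows, replicaId, status).size();
    }

    /** Small sample table, for illustration only. */
    static List<FileInfoRow> sampleRows() {
        return Arrays.asList(
            new FileInfoRow("ONE", "a.arc", FileStatus.OK),
            new FileInfoRow("ONE", "b.arc", FileStatus.CORRUPT),
            new FileInfoRow("ONE", "c.arc", FileStatus.MISSING),
            new FileInfoRow("TWO", "a.arc", FileStatus.OK),
            new FileInfoRow("TWO", "b.arc", FileStatus.OK),
            new FileInfoRow("TWO", "c.arc", FileStatus.OK));
    }
}
```

In this sketch the count is derived from the filtered list, so the count and the list cannot disagree. A database-backed implementation would more likely issue a separate counting query.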

getDateForChangedFiles

public java.sql.Date getDateForChangedFiles(Replica replica)
This method retrieves the date when the latest checksum update was performed for the replica. This is the most recent date on which the replica calculated the checksums of all the files within its archive. This method does not call out to the replicas. It only contacts the local database.

Specified by:
getDateForChangedFiles in interface ActiveBitPreservation
Parameters:
replica - The replica for which the date for last checksum update should be retrieved.
Returns:
The date of the last checksum update. If a checksum update has never occurred, null is returned.

getDateForMissingFiles

public java.sql.Date getDateForMissingFiles(Replica replica)
This method retrieves the date when the latest filelist update was performed for the replica. This is the most recent date on which the list of all the files within the archive was retrieved for the replica. This method does not call out to the replicas. It only contacts the local database.

Specified by:
getDateForMissingFiles in interface ActiveBitPreservation
Parameters:
replica - The replica for which the date for last filelist update should be retrieved.
Returns:
The date of the last filelist update. If a filelist update has never occurred, null is returned.
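
Both date getters can return null when no update has ever been performed, so callers need to handle that case before comparing dates. The following is a minimal sketch of such a check. The class UpdateAgeSketch, the method needsUpdate and the maximum-age policy are hypothetical and not part of this API.

```java
import java.sql.Date;

final class UpdateAgeSketch {
    /**
     * Decides whether a new update should be started.
     *
     * @param lastUpdate   the value returned by one of the date getters, possibly null
     * @param nowMillis    the current time in milliseconds since the epoch
     * @param maxAgeMillis the oldest acceptable age of an update (hypothetical policy)
     * @return true if no update has ever occurred or the last one is too old
     */
    static boolean needsUpdate(Date lastUpdate, long nowMillis, long maxAgeMillis) {
        if (lastUpdate == null) {
            // A null date means that the update has never been performed.
            return true;
        }
        return nowMillis - lastUpdate.getTime() > maxAgeMillis;
    }
}
```

A caller could pass the result of getDateForChangedFiles(replica) or getDateForMissingFiles(replica) together with System.currentTimeMillis().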

findChangedFiles

public void findChangedFiles(Replica replica)
The method is used to update the checksum for all the files in a replica. The checksum for the replica is retrieved either through a ChecksumJob (for a bitarchive) or through a GetAllChecksumMessage (for a checksumarchive). This will take a very large amount of time for the bitarchive, but a more limited amount of time for the checksumarchive. The corresponding replicafileinfo entries in the database for the retrieved checksum results will be updated. Then a checksum update will be performed to check for corrupted replicafileinfo entries.

Specified by:
findChangedFiles in interface ActiveBitPreservation
Parameters:
replica - The replica to find the changed files for.

findMissingFiles

public void findMissingFiles(Replica replica)
This method retrieves the filelist for the replica, and then it updates the database with this list of filenames.

Specified by:
findMissingFiles in interface ActiveBitPreservation
Parameters:
replica - The replica to find the missing files for.

getFilePreservationState

public FilePreservationState getFilePreservationState(java.lang.String filename)
Description copied from interface: ActiveBitPreservation
Get the details of the state of the given file in the bitarchives and admin data.

Specified by:
getFilePreservationState in interface ActiveBitPreservation
Parameters:
filename - A given file
Returns:
the FilePreservationState for the given file. This will be null, if the filename is not found in admin data.

getFilePreservationStateMap

public java.util.Map<java.lang.String,FilePreservationState> getFilePreservationStateMap(java.lang.String... filenames)
Description copied from interface: ActiveBitPreservation
Get details of the state of one or more files in the bitarchives and admin data.

Specified by:
getFilePreservationStateMap in interface ActiveBitPreservation
Parameters:
filenames - the list of filenames to investigate
Returns:
a map ([filename] -> [FilePreservationState]) with the preservation state of all files in the list. The preservation states in the map will be null for all filenames that are not found in admin data.
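
Because the returned map may contain null values, callers should separate known from unknown files before using the states. The following is a minimal sketch of that separation. PreservationStateMapSketch and filesWithoutState are hypothetical names, and the value type is left generic so that the sketch does not depend on FilePreservationState.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class PreservationStateMapSketch {
    /**
     * Collects the filenames whose entry in the map is null,
     * i.e. the files that were not found in admin data.
     */
    static <V> List<String> filesWithoutState(Map<String, V> states) {
        List<String> unknown = new ArrayList<>();
        for (Map.Entry<String, V> entry : states.entrySet()) {
            if (entry.getValue() == null) {
                unknown.add(entry.getKey());
            }
        }
        return unknown;
    }
}
```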

getNumberOfFiles

public long getNumberOfFiles(Replica replica)
This method finds the number of files which are known to be in the archive of a specific replica. This method will not go out to the replica, but only contact the local database. The number of files in the replica is retrieved from the database by counting the number of entries in the replicafileinfo table which belong to the replica and which have the filelist_status set to OK.

Specified by:
getNumberOfFiles in interface ActiveBitPreservation
Parameters:
replica - The replica for which the number of files should be counted.
Returns:
The number of files for a specific replica.

replaceChangedFile

public void replaceChangedFile(Replica replica,
                               java.lang.String filename,
                               java.lang.String credentials,
                               java.lang.String checksum)
Check that the checksum of the file is indeed different to the value in admin data and reference replica. If so, remove the changed file and upload it from the reference replica to this replica.

Specified by:
replaceChangedFile in interface ActiveBitPreservation
Parameters:
replica - The replica to restore file to
filename - The name of the file
credentials - The credentials used to perform this replace operation
checksum - The known bad checksum. A repair is only attempted for a file with this bad checksum.
Throws:
IOFailure - if the file cannot be reestablished.
PermissionDenied - if the file is not in correct state.
ArgumentNotValid - if any of the arguments are not valid.
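
The check described above can be read as a guard with two conditions: the file must still have the bad checksum that the caller reported, and that checksum must differ from the reference value. The following is a minimal sketch of such a guard under that reading. ReplaceGuardSketch and shouldReplace are hypothetical, the single referenceChecksum parameter assumes that admin data and the reference replica agree, and IllegalArgumentException stands in for ArgumentNotValid.

```java
final class ReplaceGuardSketch {
    /**
     * Decides whether a replace operation should go ahead.
     *
     * @param currentChecksum   checksum of the file as currently found in the replica
     * @param knownBadChecksum  the bad checksum reported by the caller
     * @param referenceChecksum the checksum agreed on by admin data and the
     *                          reference replica (an assumption of this sketch)
     * @return true only if the file still has the reported bad checksum and
     *         that checksum differs from the reference value
     */
    static boolean shouldReplace(String currentChecksum, String knownBadChecksum,
                                 String referenceChecksum) {
        if (currentChecksum == null || knownBadChecksum == null
                || referenceChecksum == null) {
            throw new IllegalArgumentException("Checksums must not be null");
        }
        // Only act on the exact bad checksum that the caller reported.
        if (!currentChecksum.equals(knownBadChecksum)) {
            return false;
        }
        // Never replace a file whose checksum matches the reference value.
        return !currentChecksum.equals(referenceChecksum);
    }
}
```

Requiring the caller to name the bad checksum protects against replacing a file whose content has changed again, or has already been repaired, since the problem was reported.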

uploadMissingFiles

public void uploadMissingFiles(Replica replica,
                               java.lang.String... filenames)
This method is used to upload missing files to a replica. For each file a good version of this file is found, and it is reestablished on the replicas where it is missing.

Specified by:
uploadMissingFiles in interface ActiveBitPreservation
Parameters:
replica - The replica where the files are missing.
filenames - The names of the files which are missing in the given replica.
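
Since getMissingFiles returns an Iterable while uploadMissingFiles takes a varargs array, a caller that wants to feed one into the other needs a conversion step. The following is a minimal sketch of that conversion. MissingFilesArgsSketch and toArray are hypothetical helper names, not part of this API.

```java
import java.util.ArrayList;
import java.util.List;

final class MissingFilesArgsSketch {
    /** Copies the filenames from an Iterable into an array usable as varargs. */
    static String[] toArray(Iterable<String> filenames) {
        List<String> list = new ArrayList<>();
        for (String name : filenames) {
            list.add(name);
        }
        return list.toArray(new String[0]);
    }
}
```

With such a helper, a call could take the form uploadMissingFiles(replica, MissingFilesArgsSketch.toArray(getMissingFiles(replica))).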

changeStateForAdminData

public void changeStateForAdminData(java.lang.String filename)
Description copied from interface: ActiveBitPreservation
Reestablish admin data to match bitarchive states for file.

Specified by:
changeStateForAdminData in interface ActiveBitPreservation
Parameters:
filename - The file to reestablish state for.

getMissingFilesForAdminData

public java.lang.Iterable<java.lang.String> getMissingFilesForAdminData()
Description copied from interface: ActiveBitPreservation
Return a list of files represented in replica but missing in AdminData.

Specified by:
getMissingFilesForAdminData in interface ActiveBitPreservation
Returns:
A list of missing files.

getChangedFilesForAdminData

public java.lang.Iterable<java.lang.String> getChangedFilesForAdminData()
Description copied from interface: ActiveBitPreservation
Return a list of files with wrong checksum or state in admin data.

Specified by:
getChangedFilesForAdminData in interface ActiveBitPreservation
Returns:
A list of files with wrong checksum or state.

addMissingFilesToAdminData

public void addMissingFilesToAdminData(java.lang.String... filenames)
Description copied from interface: ActiveBitPreservation
Add files unknown in admin.data to admin.data.

Specified by:
addMissingFilesToAdminData in interface ActiveBitPreservation
Parameters:
filenames - The files to add.

close

public void close()
Method for closing the running instance of this class.


cleanup

public void cleanup()
Description copied from interface: CleanupIF
Used to clean up a class from within a shutdown hook. Must not do any logging. Program defensively, please.

Specified by:
cleanup in interface CleanupIF
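
The requirements stated for cleanup (no logging, defensive programming) suggest an implementation that is idempotent and never lets an exception escape. The following is a minimal sketch of that style, not the implementation in this class. CleanupSketchIF is a hypothetical stand-in for CleanupIF, and ResourceHolderSketch is a hypothetical class that owns a single closeable resource.

```java
/** Hypothetical stand-in for the CleanupIF interface. */
interface CleanupSketchIF {
    void cleanup();
}

final class ResourceHolderSketch implements CleanupSketchIF {
    /** The owned resource; null once cleanup has run. */
    private AutoCloseable resource;

    ResourceHolderSketch(AutoCloseable resource) {
        this.resource = resource;
    }

    synchronized boolean isClosed() {
        return resource == null;
    }

    /** Safe to call repeatedly; never throws and never logs. */
    @Override
    public synchronized void cleanup() {
        AutoCloseable toClose = resource;
        // Drop the reference first, so that a second call is a no-op.
        resource = null;
        if (toClose == null) {
            return;
        }
        try {
            toClose.close();
        } catch (Exception e) {
            // Deliberately ignored: logging may be unavailable during shutdown.
        }
    }

    /** Registers the target's cleanup() to run when the JVM shuts down. */
    static Thread registerShutdownHook(CleanupSketchIF target) {
        Thread hook = new Thread(target::cleanup, "cleanup-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

Clearing the reference before closing means that a failure in close() cannot leave the object in a state where a later call tries to close the same resource again.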