dk.netarkivet.archive.arcrepository.bitpreservation
Class FileBasedActiveBitPreservation

java.lang.Object
  extended by dk.netarkivet.archive.arcrepository.bitpreservation.FileBasedActiveBitPreservation
All Implemented Interfaces:
ActiveBitPreservation, CleanupIF

public class FileBasedActiveBitPreservation
extends java.lang.Object
implements ActiveBitPreservation, CleanupIF

Class handling integrity check of the arcrepository.

This class must run on the same machine as the arcrepository, as it uses the same admin data file (read-only). However, it still communicates with the arcrepository over JMS.


Constructor Summary
protected FileBasedActiveBitPreservation()
          Initialises a FileBasedActiveBitPreservation instance.
 
Method Summary
 void addMissingFilesToAdminData(java.lang.String... filename)
          Reestablish admin data to match bitarchive states for files.
 void changeStateForAdminData(java.lang.String filename)
          Reestablish admin data to match bitarchive states for file.
 void cleanup()
          Used to clean up a class from within a shutdown hook.
 void close()
          Shut down cleanly.
 void findChangedFiles(Replica replica)
          This method finds out which files in a given bitarchive are misrepresented in the admin data: either having the wrong checksum or not being marked as uploaded when they actually are.
 void findMissingFiles(Replica replica)
          This method takes as input the bitarchive replica for which we wish to run a FileListJob.
 java.lang.Iterable<java.lang.String> getChangedFiles(Replica bitarchive)
          Get a list of wrong files in a given bitarchive.
 java.lang.Iterable<java.lang.String> getChangedFilesForAdminData()
          Return a list of files with wrong checksum or status in admin data.
 java.util.Date getDateForChangedFiles(Replica replica)
          Get the date for last time the checksum information was updated for this replica.
 java.util.Date getDateForMissingFiles(Replica replica)
          Get the date for last time the missing files information was updated for this replica.
 FilePreservationState getFilePreservationState(java.lang.String filename)
          Get the details of the state of the given file in the bitarchives and admin data.
 java.util.Map<java.lang.String,FilePreservationState> getFilePreservationStateMap(java.lang.String... filenames)
          Retrieve the preservation status for the files with the given filenames.
static FileBasedActiveBitPreservation getInstance()
          Get singleton instance.
 java.lang.Iterable<java.lang.String> getMissingFiles(Replica bitarchive)
          Get a list of missing files in a given bitarchive.
 java.lang.Iterable<java.lang.String> getMissingFilesForAdminData()
          Return a list of files present in bitarchive but missing in AdminData.
 long getNumberOfChangedFiles(Replica bitarchive)
          Get the number of wrong files for a bitarchive.
 long getNumberOfFiles(Replica bitarchive)
          Return the number of files found in the bitarchive.
 long getNumberOfMissingFiles(Replica bitarchive)
          Get the number of missing files in a given bitarchive.
 void replaceChangedFile(Replica replica, java.lang.String filename, java.lang.String credentials, java.lang.String checksum)
          Check that file checksum is indeed different to admin data and reference replica.
 void uploadMissingFiles(Replica replica, java.lang.String... filenames)
          Check that files are indeed missing on the bitarchive replica, and present in admin data and reference replica.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

FileBasedActiveBitPreservation

protected FileBasedActiveBitPreservation()
Initialises a FileBasedActiveBitPreservation instance.

Method Detail

getInstance

public static FileBasedActiveBitPreservation getInstance()
Get singleton instance.

Returns:
the singleton.

getFilePreservationStateMap

public java.util.Map<java.lang.String,FilePreservationState> getFilePreservationStateMap(java.lang.String... filenames)
Retrieve the preservation status for the files with the given filenames. This will ask for a fresh checksum from the bitarchives and admin data.

Specified by:
getFilePreservationStateMap in interface ActiveBitPreservation
Parameters:
filenames - List of filenames
Returns:
a map ([filename] -> [FilePreservationState]) of the preservation status for the given files. The FilePreservationState is null if the named file does not exist in admin data.
Throws:
ArgumentNotValid - if argument is null
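
Because the returned map may contain null values, callers usually need to separate known files from unknown ones. The following is a minimal sketch of that filtering step, not part of this class: PreservationStateMapSketch and unknownToAdminData are hypothetical names, and plain objects stand in for FilePreservationState so that the example runs without NetarchiveSuite.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of NetarchiveSuite.
class PreservationStateMapSketch {

    /** Returns the filenames whose entry is null, i.e. files unknown to admin data. */
    static <V> List<String> unknownToAdminData(Map<String, V> states) {
        List<String> unknown = new ArrayList<>();
        for (Map.Entry<String, V> entry : states.entrySet()) {
            if (entry.getValue() == null) {
                unknown.add(entry.getKey());
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        // Stand-in for the result of getFilePreservationStateMap(...);
        // the String value is a placeholder for a FilePreservationState.
        Map<String, Object> states = new LinkedHashMap<>();
        states.put("a.arc", "state-of-a");
        states.put("b.arc", null); // not found in admin data
        System.out.println("Unknown to admin data: " + unknownToAdminData(states));
    }
}
```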

getFilePreservationState

public FilePreservationState getFilePreservationState(java.lang.String filename)
Get the details of the state of the given file in the bitarchives and admin data.

Specified by:
getFilePreservationState in interface ActiveBitPreservation
Parameters:
filename - A given file
Returns:
the FilePreservationState for the given file. This will be null if the filename is not found in admin data.

getMissingFiles

public java.lang.Iterable<java.lang.String> getMissingFiles(Replica bitarchive)
Get a list of missing files in a given bitarchive.

Specified by:
getMissingFiles in interface ActiveBitPreservation
Parameters:
bitarchive - a given bitarchive
Returns:
a list of missing files in a given bitarchive.
Throws:
IllegalState - if the file with the list cannot be found.

findMissingFiles

public void findMissingFiles(Replica replica)
This method takes as input the bitarchive replica for which we wish to run a FileListJob. It also reads the files known to the arcrepository from the AdminData directory specified in the setting DIRS_ARCREPOSITORY_ADMIN. The two file lists are compared, and a subdirectory missingFiles is created with two unsorted files: 'missingba.txt', containing missing files, i.e. those registered in the admin data but not found in the bitarchive, and 'missingadmindata.txt', containing extra files, i.e. those found in the bitarchive but not in the arcrepository admin data.

TODO: The second file is never used in the current implementation.

FIXME: It is unclear whether the decision about which files are missing would be better placed in getMissingFiles, so that this method only runs the batch job.

Specified by:
findMissingFiles in interface ActiveBitPreservation
Parameters:
replica - the replica to search for missing files
Throws:
ArgumentNotValid - if the given directory does not contain a file filelistOutput/sorted.txt, or the argument replica is null
PermissionDenied - if the output directory cannot be created
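
The comparison described above amounts to two set differences. The sketch below illustrates that idea only; it is not the class's actual implementation. MissingFilesSketch and difference are hypothetical names, the file lists are hard-coded instead of being read from a FileListJob result or admin data, and sorted sets are used purely to make the printed order predictable (the real output files are described as unsorted).

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration, not part of NetarchiveSuite.
class MissingFilesSketch {

    /** Returns the elements of 'expected' that do not occur in 'actual'. */
    static Set<String> difference(Set<String> expected, Set<String> actual) {
        Set<String> result = new TreeSet<>(expected);
        result.removeAll(actual);
        return result;
    }

    public static void main(String[] args) {
        Set<String> adminData = new TreeSet<>(Arrays.asList("a.arc", "b.arc", "c.arc"));
        Set<String> bitarchive = new TreeSet<>(Arrays.asList("b.arc", "c.arc", "d.arc"));

        // Registered in admin data but absent from the bitarchive
        // (the role played by 'missingba.txt').
        System.out.println("Missing in bitarchive: " + difference(adminData, bitarchive));

        // Present in the bitarchive but unknown to admin data
        // (the role played by 'missingadmindata.txt').
        System.out.println("Missing in admin data: " + difference(bitarchive, adminData));
    }
}
```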

getChangedFiles

public java.lang.Iterable<java.lang.String> getChangedFiles(Replica bitarchive)
Get a list of wrong files in a given bitarchive.

Specified by:
getChangedFiles in interface ActiveBitPreservation
Parameters:
bitarchive - a bitarchive
Returns:
a list of wrong files in a given bitarchive.
Throws:
IllegalState - if the file with the list cannot be found.

findChangedFiles

public void findChangedFiles(Replica replica)
This method finds out which files in a given bitarchive are misrepresented in the admin data: either having the wrong checksum or not being marked as uploaded when they actually are.

It uses the admin data file from the DIRS_ARCREPOSITORY_ADMIN directory, as well as the files output by a runChecksumJob. The erroneous files are recorded in output files.

FIXME: It is unclear whether the decision about which files are changed would be better placed in getChangedFiles, so that this method only runs the batch job.

Specified by:
findChangedFiles in interface ActiveBitPreservation
Parameters:
replica - the bitarchive replica the checksumjob came from
Throws:
IOFailure - On file or network trouble.
PermissionDenied - if the output directory cannot be created
ArgumentNotValid - if argument replica is null
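
The two conditions named above (wrong checksum, or not marked as uploaded) can be sketched as a single pass over the checksums reported by the replica. This is an illustration under assumptions, not the actual implementation: ChangedFilesSketch and misrepresented are hypothetical names, and simple in-memory maps and sets stand in for the checksum job output and the admin data file, whose real formats are not described here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration, not part of NetarchiveSuite.
class ChangedFilesSketch {

    /**
     * Returns the files that the replica holds and admin data knows about, but
     * where admin data has a different checksum or does not mark the file as uploaded.
     */
    static Set<String> misrepresented(Map<String, String> replicaChecksums,
                                      Map<String, String> adminChecksums,
                                      Set<String> markedUploaded) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, String> entry : replicaChecksums.entrySet()) {
            String name = entry.getKey();
            if (!adminChecksums.containsKey(name)) {
                continue; // unknown to admin data: a missing-file case, not a changed one
            }
            boolean checksumDiffers = !entry.getValue().equals(adminChecksums.get(name));
            boolean notMarkedUploaded = !markedUploaded.contains(name);
            if (checksumDiffers || notMarkedUploaded) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> replica = new HashMap<>();
        replica.put("a.arc", "111");
        replica.put("b.arc", "222");
        Map<String, String> admin = new HashMap<>();
        admin.put("a.arc", "111");
        admin.put("b.arc", "999"); // checksum disagrees with the replica
        Set<String> uploaded = new HashSet<>(Arrays.asList("a.arc", "b.arc"));
        System.out.println("Misrepresented: " + misrepresented(replica, admin, uploaded));
    }
}
```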

getNumberOfFiles

public long getNumberOfFiles(Replica bitarchive)
Return the number of files found in the bitarchive. If nothing is known about the bitarchive replica, -1 is returned.

Specified by:
getNumberOfFiles in interface ActiveBitPreservation
Parameters:
bitarchive - the bitarchive to check
Returns:
the number of files found in the bitarchive. If nothing is known about the bitarchive replica, -1 is returned.

getNumberOfMissingFiles

public long getNumberOfMissingFiles(Replica bitarchive)
Get the number of missing files in a given bitarchive. If nothing is known about the bitarchive replica, -1 is returned.

Specified by:
getNumberOfMissingFiles in interface ActiveBitPreservation
Parameters:
bitarchive - a given bitarchive
Returns:
the number of missing files in the given bitarchive. If nothing is known about the bitarchive replica, -1 is returned.

getNumberOfChangedFiles

public long getNumberOfChangedFiles(Replica bitarchive)
Get the number of wrong files for a bitarchive. If nothing is known about the bitarchive replica, -1 is returned.

Specified by:
getNumberOfChangedFiles in interface ActiveBitPreservation
Parameters:
bitarchive - a bitarchive
Returns:
the number of wrong files for the bitarchive. If nothing is known about the bitarchive replica, -1 is returned.
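
The three counting methods above share the convention that -1 means nothing is known about the replica. A caller may prefer to make that case explicit instead of passing the sentinel around. The sketch below shows one way to do so; CountSentinelSketch, toOptional and describe are hypothetical names and not part of this API.

```java
import java.util.OptionalLong;

// Hypothetical helper, not part of NetarchiveSuite.
class CountSentinelSketch {

    /** Maps the -1 "unknown" sentinel to an empty OptionalLong. */
    static OptionalLong toOptional(long count) {
        return count == -1 ? OptionalLong.empty() : OptionalLong.of(count);
    }

    /** Renders a count for display, spelling out the unknown case. */
    static String describe(String label, long count) {
        OptionalLong value = toOptional(count);
        return value.isPresent() ? label + ": " + value.getAsLong() : label + ": unknown";
    }

    public static void main(String[] args) {
        // The literals stand in for values returned by getNumberOfFiles(...)
        // and getNumberOfMissingFiles(...).
        System.out.println(describe("files", 1200L));
        System.out.println(describe("missing files", -1L));
    }
}
```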

getDateForChangedFiles

public java.util.Date getDateForChangedFiles(Replica replica)
Get the date for last time the checksum information was updated for this replica.

Specified by:
getDateForChangedFiles in interface ActiveBitPreservation
Parameters:
replica - The replica whose last update time is requested.
Returns:
The date of the last check. Returns 1970-01-01 if no check has ever been run.

getDateForMissingFiles

public java.util.Date getDateForMissingFiles(Replica replica)
Get the date for last time the missing files information was updated for this replica.

Specified by:
getDateForMissingFiles in interface ActiveBitPreservation
Parameters:
replica - The replica whose last update time is requested.
Returns:
The date of the last check. Returns 1970-01-01 if no check has ever been run.
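
Both date methods use 1970-01-01 to signal that no check has been run. The sketch below shows how a caller might detect that case. It assumes the sentinel is exactly the epoch, i.e. a Date whose getTime() is 0; the documentation above gives only the calendar date, so this is an assumption to verify against the source. LastCheckDateSketch, neverChecked and describe are hypothetical names.

```java
import java.util.Date;

// Hypothetical helper, not part of NetarchiveSuite.
class LastCheckDateSketch {

    /** True if the date is the epoch, assumed here to be the "never checked" sentinel. */
    static boolean neverChecked(Date lastCheck) {
        return lastCheck.getTime() == 0L;
    }

    /** Renders the last-check date for display, spelling out the "never" case. */
    static String describe(Date lastCheck) {
        return neverChecked(lastCheck) ? "never" : lastCheck.toString();
    }

    public static void main(String[] args) {
        // The dates stand in for values returned by getDateForMissingFiles(...)
        // or getDateForChangedFiles(...).
        System.out.println(describe(new Date(0L)));
        System.out.println(describe(new Date()));
    }
}
```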

uploadMissingFiles

public void uploadMissingFiles(Replica replica,
                               java.lang.String... filenames)
Check that files are indeed missing on the bitarchive replica, and present in admin data and reference replica. If so, upload missing files from reference replica to this replica.

Specified by:
uploadMissingFiles in interface ActiveBitPreservation
Parameters:
replica - The replica to restore files to
filenames - The names of the files.
Throws:
IllegalState - If one of the files is unknown. (An upload is still attempted for all known files.)
IOFailure - If some file cannot be reestablished. All files will be attempted, though.
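
The contract above (every file is attempted, and a failure is reported only afterwards) follows a common "attempt all, then throw" pattern. The sketch below illustrates that pattern in isolation. AttemptAllSketch, attemptAll and uploadAll are hypothetical names, the upload step is an arbitrary callback instead of a real transfer between replicas, and IllegalStateException stands in for the NetarchiveSuite-specific IOFailure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration, not part of NetarchiveSuite.
class AttemptAllSketch {

    /** Tries every file and returns the names for which the upload callback threw. */
    static List<String> attemptAll(List<String> filenames, Consumer<String> upload) {
        List<String> failed = new ArrayList<>();
        for (String name : filenames) {
            try {
                upload.accept(name);
            } catch (RuntimeException e) {
                failed.add(name); // remember the failure, keep going with the rest
            }
        }
        return failed;
    }

    /** Attempts all uploads, then throws once if any of them failed. */
    static void uploadAll(List<String> filenames, Consumer<String> upload) {
        List<String> failed = attemptAll(filenames, upload);
        if (!failed.isEmpty()) {
            // Stand-in for IOFailure.
            throw new IllegalStateException("Could not reestablish: " + failed);
        }
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("a.arc", "bad.arc", "c.arc");
        try {
            uploadAll(names, name -> {
                if (name.startsWith("bad")) {
                    throw new RuntimeException("simulated transfer failure");
                }
                System.out.println("uploaded " + name);
            });
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```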

replaceChangedFile

public void replaceChangedFile(Replica replica,
                               java.lang.String filename,
                               java.lang.String credentials,
                               java.lang.String checksum)
Check that the file's checksum is indeed different from the one in admin data and the reference replica. If so, remove the changed file and upload it from the reference replica to this replica.

Specified by:
replaceChangedFile in interface ActiveBitPreservation
Parameters:
replica - The replica to restore file to
filename - The name of the file.
credentials - The credentials used to perform this replace operation
checksum - The expected checksum.
Throws:
IOFailure - if the file cannot be reestablished
PermissionDenied - if the file is not in correct state

getMissingFilesForAdminData

public java.lang.Iterable<java.lang.String> getMissingFilesForAdminData()
Return a list of files present in bitarchive but missing in AdminData.

Specified by:
getMissingFilesForAdminData in interface ActiveBitPreservation
Returns:
A list of missing files.
Throws:
IOFailure - if the list cannot be generated.

getChangedFilesForAdminData

public java.lang.Iterable<java.lang.String> getChangedFilesForAdminData()
Return a list of files with wrong checksum or status in admin data.

Specified by:
getChangedFilesForAdminData in interface ActiveBitPreservation
Returns:
A list of files with wrong checksum or status.
Throws:
IOFailure - if the list cannot be generated.

addMissingFilesToAdminData

public void addMissingFilesToAdminData(java.lang.String... filename)
Reestablish admin data to match bitarchive states for files.

Specified by:
addMissingFilesToAdminData in interface ActiveBitPreservation
Parameters:
filename - The files to reestablish state for.
Throws:
PermissionDenied - if the file is not in correct state

changeStateForAdminData

public void changeStateForAdminData(java.lang.String filename)
Reestablish admin data to match bitarchive states for file.

Specified by:
changeStateForAdminData in interface ActiveBitPreservation
Parameters:
filename - The file to reestablish state for.
Throws:
PermissionDenied - if the file is not in correct state

close

public void close()
Shut down cleanly.


cleanup

public void cleanup()
Description copied from interface: CleanupIF
Used to clean up a class from within a shutdown hook. Must not do any logging. Program defensively, please.

Specified by:
cleanup in interface CleanupIF
See Also:
CleanupIF.cleanup()