dk.netarkivet.archive.arcrepository.bitpreservation
Class FileBasedActiveBitPreservation

java.lang.Object
  extended by dk.netarkivet.archive.arcrepository.bitpreservation.FileBasedActiveBitPreservation
All Implemented Interfaces:
ActiveBitPreservation, CleanupIF

public class FileBasedActiveBitPreservation
extends java.lang.Object
implements ActiveBitPreservation, CleanupIF

Class handling integrity check of the arcrepository.

This class must run on the same machine as the arcrepository, because it reads the same admin data file (read-only access). It still communicates with the arcrepository over JMS.


Constructor Summary
protected FileBasedActiveBitPreservation()
          Initialises a FileBasedActiveBitPreservation instance.
 
Method Summary
 void addMissingFilesToAdminData(java.lang.String... filename)
          Reestablish admin data to match bitarchive states for files.
 void changeStateForAdminData(java.lang.String filename)
          Reestablish admin data to match bitarchive states for file.
 void cleanup()
          Used to clean up a class from within a shutdown hook.
 void close()
          Shut down cleanly.
 void findChangedFiles(Location location)
          This method finds out which files in a given bitarchive are misrepresented in the admin data: either they have the wrong checksum, or they are not marked as uploaded when they actually are.
 void findMissingFiles(Location location)
          This method takes as input the name of a bitarchive location for which we wish to run a FileListJob.
 java.lang.Iterable<java.lang.String> getChangedFiles(Location bitarchive)
          Get a list of wrong files in a given bitarchive.
 java.lang.Iterable<java.lang.String> getChangedFilesForAdminData()
          Return a list of files with wrong checksum or status in admin data.
 java.util.Date getDateForChangedFiles(Location location)
          Get the date for last time the checksum information was updated for this location.
 java.util.Date getDateForMissingFiles(Location location)
          Get the date for last time the missing files information was updated for this location.
 FilePreservationState getFilePreservationState(java.lang.String filename)
          Get the details of the state of the given file in the bitarchives and admin data.
 java.util.Map<java.lang.String,FilePreservationState> getFilePreservationStateMap(java.lang.String... filenames)
          Retrieve the preservation status for the files with the given filenames.
static FileBasedActiveBitPreservation getInstance()
          Get singleton instance.
 java.lang.Iterable<java.lang.String> getMissingFiles(Location bitarchive)
          Get a list of missing files in a given bitarchive.
 java.lang.Iterable<java.lang.String> getMissingFilesForAdminData()
          Return a list of files present in bitarchive but missing in AdminData.
 long getNumberOfChangedFiles(Location bitarchive)
          Get the number of wrong files for a bitarchive.
 long getNumberOfFiles(Location bitarchive)
          Return the number of files found in the bitarchive.
 long getNumberOfMissingFiles(Location bitarchive)
          Get the number of missing files in a given bitarchive.
 void replaceChangedFile(Location location, java.lang.String filename, java.lang.String credentials, java.lang.String checksum)
          Check that file checksum is indeed different to admin data and reference location.
 void uploadMissingFiles(Location location, java.lang.String... filenames)
          Check that files are indeed missing on the bitarchive location, and present in admin data and reference location.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

FileBasedActiveBitPreservation

protected FileBasedActiveBitPreservation()
Initialises a FileBasedActiveBitPreservation instance.

Method Detail

getInstance

public static FileBasedActiveBitPreservation getInstance()
Get singleton instance.

Returns:
the singleton.
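
The combination of a protected constructor and a static accessor can be illustrated with a minimal sketch. PreservationSingletonSketch is a hypothetical stand-in, not the NetarchiveSuite class, and the lazy, synchronized creation shown here is an assumption; the source does not say how the real instance is created.

```java
// Hypothetical stand-in class; it only illustrates the accessor pattern.
class PreservationSingletonSketch {
    private static PreservationSingletonSketch instance;

    // Protected, so callers must go through getInstance(),
    // while subclasses can still extend the class.
    protected PreservationSingletonSketch() {
    }

    // Creates the instance on first use. The method is synchronized so that
    // two threads cannot both create an instance.
    public static synchronized PreservationSingletonSketch getInstance() {
        if (instance == null) {
            instance = new PreservationSingletonSketch();
        }
        return instance;
    }

    public static void main(String[] args) {
        // Both calls return the same object.
        System.out.println(getInstance() == getInstance());
    }
}
```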

getFilePreservationStateMap

public java.util.Map<java.lang.String,FilePreservationState> getFilePreservationStateMap(java.lang.String... filenames)
Retrieve the preservation status for the files with the given filenames. This will ask for a fresh checksum from the bitarchives and admin data.

Specified by:
getFilePreservationStateMap in interface ActiveBitPreservation
Parameters:
filenames - List of filenames
Returns:
a map ([filename] -> [FilePreservationState]) of the preservation status for the given files. The FilePreservationState is null if the named file does not exist in admin data.
Throws:
ArgumentNotValid - if argument is null
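
Because the returned map may hold null values, callers probably want to separate known files from unknown ones before using the states. The following is a minimal sketch under that assumption; StateMapSketch and unknownFiles are hypothetical names, and a plain String stands in for FilePreservationState so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; S stands in for FilePreservationState.
class StateMapSketch {
    // Returns the filenames whose state is null, i.e. files that
    // do not exist in admin data according to the contract above.
    static <S> List<String> unknownFiles(Map<String, S> states) {
        List<String> unknown = new ArrayList<>();
        for (Map.Entry<String, S> entry : states.entrySet()) {
            if (entry.getValue() == null) {
                unknown.add(entry.getKey());
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        Map<String, String> states = new LinkedHashMap<>();
        states.put("known.arc", "placeholder-state"); // placeholder value
        states.put("unknown.arc", null);              // not in admin data
        System.out.println(unknownFiles(states));
    }
}
```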

getFilePreservationState

public FilePreservationState getFilePreservationState(java.lang.String filename)
Get the details of the state of the given file in the bitarchives and admin data.

Specified by:
getFilePreservationState in interface ActiveBitPreservation
Parameters:
filename - A given file
Returns:
the FilePreservationState for the given file. This is null if the filename is not found in admin data.

getMissingFiles

public java.lang.Iterable<java.lang.String> getMissingFiles(Location bitarchive)
Get a list of missing files in a given bitarchive.

Specified by:
getMissingFiles in interface ActiveBitPreservation
Parameters:
bitarchive - a given bitarchive
Returns:
a list of missing files in a given bitarchive.
Throws:
IllegalState - if the file with the list cannot be found.

findMissingFiles

public void findMissingFiles(Location location)
This method takes as input the name of a bitarchive location for which we wish to run a FileListJob. It also reads the known files in the arcrepository from the AdminData directory specified in the setting DIRS_ARCREPOSITORY_ADMIN. The two file lists are compared, and a subdirectory missingFiles is created containing two unsorted files: 'missingba.txt', which lists missing files, i.e. those registered in the admin data but not found in the bitarchive, and 'missingadmindata.txt', which lists extra files, i.e. those found in the bitarchive but not in the arcrepository admin data.

TODO: The second file is never used in the current implementation.

FIXME: It is unclear whether the decision about which files are missing would be better placed in getMissingFiles, so that this method only runs the batch job.

Specified by:
findMissingFiles in interface ActiveBitPreservation
Parameters:
location - the location to search for missing files
Throws:
ArgumentNotValid - if the given directory does not contain a file filelistOutput/sorted.txt, or the argument location is null
PermissionDenied - if the output directory cannot be created
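
The comparison of the two file lists amounts to two set differences. The sketch below shows that idea only; MissingFilesSketch and its methods are hypothetical names, the inputs are in-memory sets rather than the FileListJob output and admin data file, and the results are sorted for readable output even though the real files are described as unsorted.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper illustrating the two-way comparison.
class MissingFilesSketch {
    // Files registered in admin data but not found in the bitarchive
    // (the content described for 'missingba.txt').
    static Set<String> missingInBitarchive(Set<String> adminData, Set<String> bitarchive) {
        Set<String> result = new TreeSet<>(adminData);
        result.removeAll(bitarchive);
        return result;
    }

    // Files found in the bitarchive but not in admin data
    // (the content described for 'missingadmindata.txt').
    static Set<String> missingInAdminData(Set<String> adminData, Set<String> bitarchive) {
        Set<String> result = new TreeSet<>(bitarchive);
        result.removeAll(adminData);
        return result;
    }

    public static void main(String[] args) {
        Set<String> admin = new TreeSet<>(Arrays.asList("a.arc", "b.arc"));
        Set<String> bitarchive = new TreeSet<>(Arrays.asList("b.arc", "c.arc"));
        System.out.println(missingInBitarchive(admin, bitarchive));
        System.out.println(missingInAdminData(admin, bitarchive));
    }
}
```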

getChangedFiles

public java.lang.Iterable<java.lang.String> getChangedFiles(Location bitarchive)
Get a list of wrong files in a given bitarchive.

Specified by:
getChangedFiles in interface ActiveBitPreservation
Parameters:
bitarchive - a bitarchive
Returns:
a list of wrong files in a given bitarchive.
Throws:
IllegalState - if the file with the list cannot be found.

findChangedFiles

public void findChangedFiles(Location location)
This method finds out which files in a given bitarchive are misrepresented in the admin data: either they have the wrong checksum, or they are not marked as uploaded when they actually are.

It uses the admindata file from the DIRS_ARCREPOSITORY_ADMIN directory, as well as the files output by a runChecksumJob. The erroneous files are stored in files. FIXME: It is unclear whether the decision about which files are changed would be better placed in getChangedFiles, so that this method only runs the batch job.

Specified by:
findChangedFiles in interface ActiveBitPreservation
Parameters:
location - the bitarchive location the checksumjob came from
Throws:
IOFailure - On file or network trouble.
PermissionDenied - if the output directory cannot be created
ArgumentNotValid - if argument location is null
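
The checksum half of this check can be pictured as a comparison of two filename-to-checksum maps. The sketch below is an illustration under assumptions: ChangedFilesSketch and changedFiles are hypothetical names, the maps replace the admin data file and the checksum job output, and the upload-status check mentioned above is left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; covers only the "wrong checksum" case.
class ChangedFilesSketch {
    // Returns the files present in both maps whose checksums differ.
    // Files missing from admin data are a separate concern and are skipped.
    static Set<String> changedFiles(Map<String, String> adminChecksums,
                                    Map<String, String> bitarchiveChecksums) {
        Set<String> changed = new TreeSet<>();
        for (Map.Entry<String, String> entry : bitarchiveChecksums.entrySet()) {
            String expected = adminChecksums.get(entry.getKey());
            if (expected != null && !expected.equals(entry.getValue())) {
                changed.add(entry.getKey());
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        Map<String, String> admin = new HashMap<>();
        admin.put("a.arc", "1111");
        admin.put("b.arc", "2222");
        Map<String, String> bitarchive = new HashMap<>();
        bitarchive.put("a.arc", "1111");
        bitarchive.put("b.arc", "ffff"); // differs from admin data
        System.out.println(changedFiles(admin, bitarchive));
    }
}
```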

getNumberOfFiles

public long getNumberOfFiles(Location bitarchive)
Return the number of files found in the bitarchive. If nothing is known about the bitarchive location, -1 is returned.

Specified by:
getNumberOfFiles in interface ActiveBitPreservation
Parameters:
bitarchive - the bitarchive to check
Returns:
the number of files found in the bitarchive. If nothing is known about the bitarchive location, -1 is returned.

getNumberOfMissingFiles

public long getNumberOfMissingFiles(Location bitarchive)
Get the number of missing files in a given bitarchive. If nothing is known about the bitarchive location, -1 is returned.

Specified by:
getNumberOfMissingFiles in interface ActiveBitPreservation
Parameters:
bitarchive - a given bitarchive
Returns:
the number of missing files in the given bitarchive. If nothing is known about the bitarchive location, -1 is returned.

getNumberOfChangedFiles

public long getNumberOfChangedFiles(Location bitarchive)
Get the number of wrong files for a bitarchive. If nothing is known about the bitarchive location, -1 is returned.

Specified by:
getNumberOfChangedFiles in interface ActiveBitPreservation
Parameters:
bitarchive - a bitarchive
Returns:
the number of wrong files for the bitarchive. If nothing is known about the bitarchive location, -1 is returned.
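
The three counting methods above share the convention that -1 means nothing is known about the location. One way for a caller to avoid treating -1 as a real count is to convert it to an optional value, as in this minimal sketch. FileCountSketch and toOptional are hypothetical names that are not part of the API.

```java
import java.util.OptionalLong;

// Hypothetical helper for handling the -1 "unknown" convention.
class FileCountSketch {
    // Maps the sentinel -1 to an empty OptionalLong and any other
    // value to a present one.
    static OptionalLong toOptional(long count) {
        return count == -1 ? OptionalLong.empty() : OptionalLong.of(count);
    }

    public static void main(String[] args) {
        System.out.println(toOptional(-1).isPresent());
        System.out.println(toOptional(42).getAsLong());
    }
}
```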

getDateForChangedFiles

public java.util.Date getDateForChangedFiles(Location location)
Get the date for last time the checksum information was updated for this location.

Specified by:
getDateForChangedFiles in interface ActiveBitPreservation
Parameters:
location - The location whose last update time is requested.
Returns:
The date of the last check. Returns 1970-01-01 if no check has ever been run.

getDateForMissingFiles

public java.util.Date getDateForMissingFiles(Location location)
Get the date for last time the missing files information was updated for this location.

Specified by:
getDateForMissingFiles in interface ActiveBitPreservation
Parameters:
location - The location whose last update time is requested.
Returns:
The date of the last check. Returns 1970-01-01 if no check has ever been run.
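
Both date methods use 1970-01-01 to signal that no information has ever been collected. The sketch below shows one way a caller might test for that value. It assumes that the sentinel is exactly the epoch, new Date(0), which the source does not confirm; LastCheckSketch and neverChecked are hypothetical names.

```java
import java.util.Date;

// Hypothetical helper for detecting the "never" sentinel.
class LastCheckSketch {
    // Assumption: "never" is encoded as the epoch, i.e. new Date(0).
    static boolean neverChecked(Date lastCheck) {
        return lastCheck.getTime() == 0L;
    }

    public static void main(String[] args) {
        System.out.println(neverChecked(new Date(0)));
        System.out.println(neverChecked(new Date()));
    }
}
```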

uploadMissingFiles

public void uploadMissingFiles(Location location,
                               java.lang.String... filenames)
Check that files are indeed missing on the bitarchive location, and present in admin data and reference location. If so, upload missing files from reference location to this location.

Specified by:
uploadMissingFiles in interface ActiveBitPreservation
Parameters:
location - The location to restore files to
filenames - The names of the files.
Throws:
IllegalState - If one of the files is unknown (for all known files, there will be an attempt at upload)
IOFailure - If some file cannot be reestablished. All files will be attempted, though.
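
The contract above, where every file is attempted and a failure is reported only afterwards, follows a common collect-then-throw pattern. The sketch below illustrates that pattern only. AttemptAllSketch and uploadAll are hypothetical names, the uploader is a caller-supplied stand-in, and IllegalStateException replaces the IOFailure and IllegalState exceptions named above so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper illustrating "attempt all, then report".
class AttemptAllSketch {
    // Calls the uploader for every filename. Failures are collected,
    // and a single exception is thrown after all files were attempted.
    static void uploadAll(Consumer<String> uploader, String... filenames) {
        List<String> failed = new ArrayList<>();
        for (String name : filenames) {
            try {
                uploader.accept(name);
            } catch (RuntimeException e) {
                failed.add(name);
            }
        }
        if (!failed.isEmpty()) {
            // Stand-in for the exception types named in the Throws section.
            throw new IllegalStateException("Could not reestablish: " + failed);
        }
    }

    public static void main(String[] args) {
        // A no-op uploader: nothing fails, so nothing is thrown.
        uploadAll(name -> { }, "a.arc", "b.arc");
        System.out.println("all files attempted");
    }
}
```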

replaceChangedFile

public void replaceChangedFile(Location location,
                               java.lang.String filename,
                               java.lang.String credentials,
                               java.lang.String checksum)
Check that file checksum is indeed different to admin data and reference location. If so, remove the changed file and upload it from reference location to this location.

Specified by:
replaceChangedFile in interface ActiveBitPreservation
Parameters:
location - The location to restore file to
filename - The name of the file.
credentials - The credentials used to perform this replace operation
checksum - The expected checksum.
Throws:
IOFailure - if the file cannot be reestablished
PermissionDenied - if the file is not in correct state

getMissingFilesForAdminData

public java.lang.Iterable<java.lang.String> getMissingFilesForAdminData()
Return a list of files present in bitarchive but missing in AdminData.

Specified by:
getMissingFilesForAdminData in interface ActiveBitPreservation
Returns:
A list of missing files.
Throws:
IOFailure - if the list cannot be generated.

getChangedFilesForAdminData

public java.lang.Iterable<java.lang.String> getChangedFilesForAdminData()
Return a list of files with wrong checksum or status in admin data.

Specified by:
getChangedFilesForAdminData in interface ActiveBitPreservation
Returns:
A list of files with wrong checksum or status.
Throws:
IOFailure - if the list cannot be generated.

addMissingFilesToAdminData

public void addMissingFilesToAdminData(java.lang.String... filename)
Reestablish admin data to match bitarchive states for files.

Specified by:
addMissingFilesToAdminData in interface ActiveBitPreservation
Parameters:
filename - The files to reestablish state for.
Throws:
PermissionDenied - if the file is not in correct state

changeStateForAdminData

public void changeStateForAdminData(java.lang.String filename)
Reestablish admin data to match bitarchive states for file.

Specified by:
changeStateForAdminData in interface ActiveBitPreservation
Parameters:
filename - The file to reestablish state for.
Throws:
PermissionDenied - if the file is not in correct state

close

public void close()
Shut down cleanly.


cleanup

public void cleanup()
Description copied from interface: CleanupIF
Used to clean up a class from within a shutdown hook. Must not do any logging. Program defensively, please.

Specified by:
cleanup in interface CleanupIF
See Also:
CleanupIF.cleanup()
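
A cleanup method of this kind is typically invoked from a thread registered with Runtime.addShutdownHook. The sketch below shows that wiring under assumptions: ShutdownHookSketch, Cleanable and hookFor are hypothetical names, Cleanable stands in for CleanupIF, and how NetarchiveSuite actually registers its hooks is not stated in the source.

```java
// Hypothetical sketch of calling a cleanup method from a shutdown hook.
class ShutdownHookSketch {
    // Stand-in for CleanupIF.
    interface Cleanable {
        void cleanup();
    }

    // Wraps the cleanup call in a thread suitable for a shutdown hook.
    // Exceptions are swallowed without logging, in line with the
    // "must not do any logging" rule above.
    static Thread hookFor(Cleanable target) {
        return new Thread(() -> {
            try {
                target.cleanup();
            } catch (RuntimeException e) {
                // Deliberately ignored: the JVM is shutting down.
            }
        });
    }

    public static void main(String[] args) {
        // The cleanup body is empty here; a real one would release resources.
        Runtime.getRuntime().addShutdownHook(hookFor(() -> { }));
    }
}
```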