Class ReplicaCacheDatabase

  • All Implemented Interfaces:
    BitPreservationDAO, CleanupIF

    public final class ReplicaCacheDatabase
    extends Object
    implements BitPreservationDAO
    Class for storing the bit preservation cache in a database.

    This class uses the 'admin.data' file to retrieve the upload status.

    • Field Detail

      • log

        protected static final org.slf4j.Logger log
        The log.
      • updateChecksumStatusSql

        public static final String updateChecksumStatusSql
        SQL used to update the checksum status of straightforward cases. See complete description for method below.
      • selectForFileChecksumVotingSql

        public static final String selectForFileChecksumVotingSql
        SQL used to select those files whose checksum status has to be voted on. See complete description for method below.
    • Method Detail

      • getInstance

        public static ReplicaCacheDatabase getInstance()
        Method for retrieving the current instance of this class.
        Returns:
        The current instance.
      • initialiseDB

        protected void initialiseDB​(Connection connection)
        Method for initialising the database. This basically makes sure that all the replicas are within the database, and that no unknown replicas have been defined.
        Parameters:
        connection - An open connection to the archive database
      • getReplicaFileInfo

        public ReplicaFileInfo getReplicaFileInfo​(String filename,
                                                  Replica replica)
                                           throws ArgumentNotValid
        Method for retrieving the entry in the replicafileinfo table for a given file and replica.
        Specified by:
        getReplicaFileInfo in interface BitPreservationDAO
        Parameters:
        filename - The name of the file for the entry.
        replica - The replica of the entry.
        Returns:
        The replicafileinfo entry corresponding to the given filename and replica.
        Throws:
        ArgumentNotValid - If the filename is either null or empty, or if the replica is null.
      • getChecksum

        public String getChecksum​(String filename)
                           throws ArgumentNotValid
        Method for retrieving the checksum for a specific file. Since a checksum is not directly attached to a file, the checksum of a file must be found by having the replicafileinfo entries for the file vote on it.
        Parameters:
        filename - The name of the file whose checksum is to be found.
        Returns:
        The checksum of the file, or null if no validated checksum can be found.
        Throws:
        ArgumentNotValid - If the filename is either null or the empty string.
      • retrieveAllFilenames

        public Collection<String> retrieveAllFilenames()
        Retrieves the names of all the files in the file table of the database.
        Returns:
        The list of filenames known by the database.
      • getReplicaStoreState

        public ReplicaStoreState getReplicaStoreState​(String filename,
                                                      String replicaId)
                                               throws ArgumentNotValid
        Retrieves the ReplicaStoreState for the entry in the replicafileinfo table, which refers to the given file and replica.
        Parameters:
        filename - The name of the file in the filetable.
        replicaId - The id of the replica.
        Returns:
        The ReplicaStoreState for the specified entry.
        Throws:
        ArgumentNotValid - If the replicaId or the filename is either null or the empty string.
      • setReplicaStoreState

        public void setReplicaStoreState​(String filename,
                                         String replicaId,
                                         ReplicaStoreState state)
                                  throws ArgumentNotValid
        Sets the ReplicaStoreState for the entry in the replicafileinfo table.
        Parameters:
        filename - The name of the file in the filetable.
        replicaId - The id of the replica.
        state - The ReplicaStoreState for the specified entry.
        Throws:
        ArgumentNotValid - If the replicaId or the filename is either null or the empty string, or if the ReplicaStoreState is null.
      • insertNewFileForUpload

        public void insertNewFileForUpload​(String filename,
                                           String checksum)
                                    throws ArgumentNotValid,
                                           IllegalState
        Creates a new entry for the filename for each replica, gives it the given checksum and sets the upload_status = UNKNOWN_UPLOAD_STATUS.
        Parameters:
        filename - The name of the file.
        checksum - The checksum of the file.
        Throws:
        ArgumentNotValid - If the filename or the checksum is either null or the empty string.
        IllegalState - If the file exists with another checksum on one of the replicas. Or if the file has already been completely uploaded to one of the replicas.
      • changeStateOfReplicafileinfo

        public void changeStateOfReplicafileinfo​(String filename,
                                                 Replica replica,
                                                 ReplicaStoreState state)
                                          throws ArgumentNotValid
        Method for recording in the database that a file upload has begun for a specific replica. It is not checked whether the entry has another checksum or another UploadStatus.
        Parameters:
        filename - The name of the file.
        replica - The replica for the replicafileinfo.
        state - The new ReplicaStoreState for the entry.
        Throws:
        ArgumentNotValid - If the filename is either null or the empty string. Or if the replica or the status is null.
      • changeStateOfReplicafileinfo

        public void changeStateOfReplicafileinfo​(String filename,
                                                 String checksum,
                                                 Replica replica,
                                                 ReplicaStoreState state)
                                          throws ArgumentNotValid,
                                                 IllegalState
        Method for recording in the database that a file upload has begun for a specific replica. It is not checked whether the entry has another checksum or another UploadStatus.
        Parameters:
        filename - The name of the file.
        checksum - The new checksum for the entry.
        replica - The replica for the replicafileinfo.
        state - The new ReplicaStoreState for the entry.
        Throws:
        ArgumentNotValid - If the filename or the checksum is either null or the empty string. Or if the replica or the status is null.
        IllegalState - If an sql exception is thrown.
      • retrieveFilenamesForReplicaEntries

        public Collection<String> retrieveFilenamesForReplicaEntries​(String replicaId,
                                                                     ReplicaStoreState state)
                                                              throws ArgumentNotValid
        Retrieves the names of all the files in the given replica which have the specified UploadStatus.
        Parameters:
        replicaId - The id of the replica which contains the files.
        state - The ReplicaStoreState for the wanted files.
        Returns:
        The list of filenames for the entries in the replica which have the specified UploadStatus.
        Throws:
        ArgumentNotValid - If the UploadStatus is null or if the replicaId is either null or the empty string.
      • existsFileInDB

        public boolean existsFileInDB​(String filename)
                               throws IllegalState
        Checks whether a file is already in the file table in the database.
        Parameters:
        filename - The name of the file in the database.
        Returns:
        Whether the file was found in the database.
        Throws:
        IllegalState - If more than one entry with the given filename was found.
      • retrieveFileListStatus

        public FileListStatus retrieveFileListStatus​(String filename,
                                                     Replica replica)
                                              throws ArgumentNotValid
        Method for retrieving the filelist_status for a replicafileinfo entry.
        Parameters:
        filename - The name of the file.
        replica - The replica where the file should be.
        Returns:
        The filelist_status for the file in the replica.
        Throws:
        ArgumentNotValid - If the replica is null or the filename is either null or the empty string.
      • updateChecksumStatus

        public void updateChecksumStatus()
        This method is used to update the status for the checksums for all replicafileinfo entries.

        For each file in the database, the checksum vote is made in the following way.
        Each entry in the replicafileinfo table containing the file is retrieved. All the unique checksums are retrieved, i.e. if a checksum is found more than once, the duplicates are ignored.
        If only one unique checksum is found, then it must be the correct one, and all the replicas with this file will have their checksum_status set to 'OK'.
        If more than one checksum is found, then a vote for the correct checksum is performed. This is done by counting the number of times each unique checksum is found among the replicafileinfo entries for the current file. The checksum with the most votes is chosen as the correct one, and the checksum_status for all the replicafileinfo entries with this checksum is set to 'OK', whereas the replicafileinfo entries with a different checksum are set to 'CORRUPT'.
        If no winner is found, then a warning and a notification are issued, and the checksum_status for all the replicafileinfo entries for the current file is set to 'UNKNOWN'.
        Specified by:
        updateChecksumStatus in interface BitPreservationDAO
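
        The vote described above can be illustrated with a minimal, self-contained sketch. It is not the class's actual implementation: the class ChecksumVoteSketch, its Status enum and the vote method are hypothetical names, the input is an in-memory map rather than the replicafileinfo table, and treating a tie for the top count as "no winner" is an assumption of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified, hypothetical model of the checksum vote (not the library's code). */
final class ChecksumVoteSketch {

    /** Mirrors the checksum_status values named in the description above. */
    enum Status { OK, CORRUPT, UNKNOWN }

    /**
     * Votes on the checksum of a single file.
     *
     * @param checksumByReplica replica id mapped to the non-null checksum that replica reports
     * @return replica id mapped to the status resulting from the vote
     */
    static Map<String, Status> vote(Map<String, String> checksumByReplica) {
        // Count how many replicas report each distinct checksum.
        Map<String, Integer> votes = new LinkedHashMap<>();
        for (String checksum : checksumByReplica.values()) {
            votes.merge(checksum, 1, Integer::sum);
        }

        // Find the checksum with the most votes. A shared top count is treated
        // as "no winner" (an assumption of this sketch).
        String winner = null;
        int best = 0;
        boolean tie = false;
        for (Map.Entry<String, Integer> entry : votes.entrySet()) {
            int count = entry.getValue();
            if (count > best) {
                best = count;
                winner = entry.getKey();
                tie = false;
            } else if (count == best) {
                tie = true;
            }
        }

        // Assign a status to every replica based on the outcome of the vote.
        Map<String, Status> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : checksumByReplica.entrySet()) {
            Status status;
            if (winner == null || tie) {
                status = Status.UNKNOWN;
            } else if (entry.getValue().equals(winner)) {
                status = Status.OK;
            } else {
                status = Status.CORRUPT;
            }
            result.put(entry.getKey(), status);
        }
        return result;
    }
}
```

        With three replicas where two report the same checksum, the sketch marks those two OK and the third CORRUPT. With two replicas that disagree, both end up UNKNOWN.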
      • updateChecksumStatus

        public void updateChecksumStatus​(String filename)
                                  throws ArgumentNotValid
        Method for updating the checksum status of a specific file for all the replicas. If the checksums differ between replicas, then a checksum vote chooses a specific checksum as the 'correct' one, and the entries with a checksum other than the 'correct' one are marked as corrupt.
        Specified by:
        updateChecksumStatus in interface BitPreservationDAO
        Parameters:
        filename - The name of the file to update the status for.
        Throws:
        ArgumentNotValid - If the filename is either null or the empty string.
      • addChecksumInformation

        public void addChecksumInformation​(File checksumOutputFile,
                                           Replica replica)
        Given the output of a checksum job, add the results to the database.

        The following fields in the table are updated for each corresponding entry in the replicafileinfo table:
        - checksum = the given checksum.
        - filelist_status = ok.
        - filelist_checkdatetime = now.
        - checksum_checkdatetime = now.

        Specified by:
        addChecksumInformation in interface BitPreservationDAO
        Parameters:
        checksumOutputFile - The output of a checksum job in a file
        replica - The replica this checksum job is for.
      • addFileListInformation

        public void addFileListInformation​(File filelistFile,
                                           Replica replica)
                                    throws ArgumentNotValid,
                                           UnknownID
        Method for adding the results from a list of filenames on a replica. This list of filenames should return the list of all the files within the database.

        For each file in the FileListJob the following fields are set for the corresponding entry in the replicafileinfo table:
        - filelist_status = ok.
        - filelist_checkdatetime = now.

        For each entry in the replicafileinfo table for the replica which is missing from the results of the FileListJob, the following fields are assigned the following values:
        - filelist_status = missing.
        - filelist_checkdatetime = now.

        Specified by:
        addFileListInformation in interface BitPreservationDAO
        Parameters:
        filelistFile - The list of filenames either parsed from a FilelistJob or the result from a GetAllFilenamesMessage.
        replica - The replica, which the FilelistBatchjob has run upon.
        Throws:
        ArgumentNotValid - If the filelist or the replica is null.
        UnknownID - If the replica does not already exist in the database.
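
        The two update rules above amount to comparing the reported file list with the entries already known for the replica. The following is a rough sketch under stated assumptions, not the actual database logic: FileListUpdateSketch, ListStatus and reconcile are hypothetical names, sets of filenames stand in for the replicafileinfo table and the parsed file, and the filelist_checkdatetime update is omitted.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Simplified, hypothetical model of a file list update for one replica. */
final class FileListUpdateSketch {

    /** Stand-in for the filelist_status values 'ok' and 'missing' named above. */
    enum ListStatus { OK, MISSING }

    /**
     * @param knownFiles    filenames that already have an entry for the replica
     * @param reportedFiles filenames listed in the file list result
     * @return filename mapped to its new list status
     */
    static Map<String, ListStatus> reconcile(Set<String> knownFiles, Set<String> reportedFiles) {
        Map<String, ListStatus> result = new TreeMap<>();
        // Every file that appears in the result is present on the replica.
        for (String name : reportedFiles) {
            result.put(name, ListStatus.OK);
        }
        // Every known entry that the result does not mention is missing.
        for (String name : knownFiles) {
            if (!reportedFiles.contains(name)) {
                result.put(name, ListStatus.MISSING);
            }
        }
        return result;
    }
}
```

        In this sketch a reported file with no prior entry simply gets a new OK entry. How the real class handles such files is not stated in this documentation.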
      • getDateOfLastWrongFilesUpdate

        public Date getDateOfLastWrongFilesUpdate​(Replica replica)
                                           throws ArgumentNotValid,
                                                  IllegalArgumentException
        Method for retrieving the date for the last update for corrupted files.

        This method does not contact the replicas, it only retrieves the data from the last time the checksum was retrieved.

        Specified by:
        getDateOfLastWrongFilesUpdate in interface BitPreservationDAO
        Parameters:
        replica - The replica to find the date for the latest update for corruption of files.
        Returns:
        The date for the last checksum update. A null is returned if no wrong files update has been performed for this replica.
        Throws:
        ArgumentNotValid - If the replica is null.
        IllegalArgumentException - If the Date of the Timestamp cannot be instantiated.
      • getNumberOfMissingFilesInLastUpdate

        public long getNumberOfMissingFilesInLastUpdate​(Replica replica)
                                                 throws ArgumentNotValid
        Method for retrieving the number of files missing from a specific replica.

        This method does not contact the replica directly, it only retrieves the count of missing files from the last filelist update.

        Specified by:
        getNumberOfMissingFilesInLastUpdate in interface BitPreservationDAO
        Parameters:
        replica - The replica to find the number of missing files for.
        Returns:
        The number of missing files for the replica.
        Throws:
        ArgumentNotValid - If the replica is null.
      • getMissingFilesInLastUpdate

        public Iterable<String> getMissingFilesInLastUpdate​(Replica replica)
                                                     throws ArgumentNotValid
        Method for retrieving the list of the names of the files which were missing for the replica in the last filelist update.

        This method does not contact the replica; it only uses the database to find the files which were missing during the last filelist update.

        Specified by:
        getMissingFilesInLastUpdate in interface BitPreservationDAO
        Parameters:
        replica - The replica to find the list of missing files for.
        Returns:
        A list containing the names of the files which are missing in the given replica.
        Throws:
        ArgumentNotValid - If the replica is null.
      • getNumberOfWrongFilesInLastUpdate

        public long getNumberOfWrongFilesInLastUpdate​(Replica replica)
                                               throws ArgumentNotValid
        Method for retrieving the number of files with an incorrect checksum within a replica.

        This method does not contact the replica; it only uses the database to count the number of files which are corrupt.

        Specified by:
        getNumberOfWrongFilesInLastUpdate in interface BitPreservationDAO
        Parameters:
        replica - The replica to find the number of corrupted files for.
        Returns:
        The number of corrupted files.
        Throws:
        ArgumentNotValid - If the replica is null.
      • getWrongFilesInLastUpdate

        public Iterable<String> getWrongFilesInLastUpdate​(Replica replica)
                                                   throws ArgumentNotValid
        Method for retrieving the list of the files in the replica which have an incorrect checksum, i.e. the checksum_status is set to CORRUPT.

        This method does not contact the replica, it only uses the local database.

        Specified by:
        getWrongFilesInLastUpdate in interface BitPreservationDAO
        Parameters:
        replica - The replica to find the list of corrupted files for.
        Returns:
        The list of files which have wrong checksums.
        Throws:
        ArgumentNotValid - If the replica is null.
      • getNumberOfFiles

        public long getNumberOfFiles​(Replica replica)
                              throws ArgumentNotValid
        Method for retrieving the number of files within a replica. This counts all the files which are not missing from the replica, thus all entries in the replicafileinfo table which have the filelist_status set to OK. Whether the files have a correct checksum is ignored.

        This method does not contact the replica, it only uses the local database.

        Specified by:
        getNumberOfFiles in interface BitPreservationDAO
        Parameters:
        replica - The replica to count the number of files for.
        Returns:
        The number of files within the replica.
        Throws:
        ArgumentNotValid - If the replica is null.
      • getBitarchiveWithGoodFile

        public Replica getBitarchiveWithGoodFile​(String filename)
                                          throws ArgumentNotValid
        Method for finding a replica with a valid version of a file. This method is used in order to find a replica from which a file should be retrieved, during the process of restoring a corrupt file on another replica.

        This replica must be of the type bitarchive, since a file cannot be retrieved from a checksum replica.

        Specified by:
        getBitarchiveWithGoodFile in interface BitPreservationDAO
        Parameters:
        filename - The name of the file which needs to have a valid version in a bitarchive.
        Returns:
        A bitarchive which contains a valid version of the file, or null if no such bitarchive exists.
        Throws:
        ArgumentNotValid - If the filename is null or the empty string.
      • getBitarchiveWithGoodFile

        public Replica getBitarchiveWithGoodFile​(String filename,
                                                 Replica badReplica)
                                          throws ArgumentNotValid
        Method for finding a replica with a valid version of a file. This method is used in order to find a replica from which a file should be retrieved, during the process of restoring a corrupt file on another replica.

        This replica must be of the type bitarchive, since a file cannot be retrieved from a checksum replica.

        Specified by:
        getBitarchiveWithGoodFile in interface BitPreservationDAO
        Parameters:
        filename - The name of the file which needs to have a valid version in a bitarchive.
        badReplica - The Replica which has a bad copy of the given file
        Returns:
        A bitarchive which contains a valid version of the file, or null if no such bitarchive exists (in which case, a notification is sent)
        Throws:
        ArgumentNotValid - If the replica is null or the filename is either null or the empty string.
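
        The selection rule shared by both getBitarchiveWithGoodFile overloads can be sketched as a filter over the entries for one file. This is an illustration only: GoodReplicaSketch, Entry, ReplicaType, ChecksumStatus and findGoodBitarchive are hypothetical names, "valid version" is assumed here to mean a checksum status of OK, and the sketch returns an Optional where the real methods return null.

```java
import java.util.List;
import java.util.Optional;

/** Simplified, hypothetical model of choosing a bitarchive that holds a good copy. */
final class GoodReplicaSketch {

    enum ReplicaType { BITARCHIVE, CHECKSUM }

    enum ChecksumStatus { OK, CORRUPT, UNKNOWN }

    /** One replica's view of a single file (stand-in for a replicafileinfo entry). */
    static final class Entry {
        final String replicaId;
        final ReplicaType type;
        final ChecksumStatus status;

        Entry(String replicaId, ReplicaType type, ChecksumStatus status) {
            this.replicaId = replicaId;
            this.type = type;
            this.status = status;
        }
    }

    /**
     * @param entries      the entries for one file, one per replica
     * @param badReplicaId id of the replica with the bad copy, or null if none is excluded
     * @return id of a bitarchive replica with a good copy, or empty if there is none
     */
    static Optional<String> findGoodBitarchive(List<Entry> entries, String badReplicaId) {
        return entries.stream()
                // A file can only be fetched from a bitarchive, never from a checksum replica.
                .filter(e -> e.type == ReplicaType.BITARCHIVE)
                // Assumption: a "valid version" is one whose checksum status is OK.
                .filter(e -> e.status == ChecksumStatus.OK)
                // Never offer the replica that is known to hold the bad copy.
                .filter(e -> !e.replicaId.equals(badReplicaId))
                .map(e -> e.replicaId)
                .findFirst();
    }
}
```

        Passing null as badReplicaId corresponds to the one-argument overload, where no replica is excluded up front.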
      • updateChecksumInformationForFileOnReplica

        public void updateChecksumInformationForFileOnReplica​(String filename,
                                                              String checksum,
                                                              Replica replica)
                                                       throws ArgumentNotValid
        Method for updating a specific entry in the replicafileinfo table. Based on the filename, checksum and replica it is verified whether a file is missing, corrupt or valid.
        Specified by:
        updateChecksumInformationForFileOnReplica in interface BitPreservationDAO
        Parameters:
        filename - Name of the file.
        checksum - The checksum of the file. Is allowed to be null, if no file is found.
        replica - The replica where the file exists.
        Throws:
        ArgumentNotValid - If the filename is null or the empty string, or if the replica is null.
      • insertAdminEntry

        public boolean insertAdminEntry​(String line)
                                 throws ArgumentNotValid
        Method for inserting a line of Admin.Data into the database. It is assumed that it is a '0.4' admin.data line.
        Parameters:
        line - The line to insert into the database.
        Returns:
        Whether the line was valid.
        Throws:
        ArgumentNotValid - If the line is null. If it is empty, then it is logged.
      • setAdminDate

        public void setAdminDate​(Date date)
                          throws ArgumentNotValid
        Method for setting a specific value for the filelistdate and the checksumlistdate for all the replicas.
        Parameters:
        date - The new date for the checksumlist and filelist for all the replicas.
        Throws:
        ArgumentNotValid - If the date is null.
      • isEmpty

        public boolean isEmpty()
        Method for telling whether the database is empty. The database is empty if it does not contain any files.

        The database will not be entirely empty, since the replicas are put into the replica table during the instantiation of this class, but if the file table is empty, then the replicafileinfo table is also empty, and the database will be considered empty.

        Returns:
        Whether the file list is empty.
      • retrieveAsText

        public String retrieveAsText()
        Method to print all the tables in the database.
        Returns:
        All the tables as a text string.