Class WaybackIndexer

  • All Implemented Interfaces:
    CleanupIF

    public class WaybackIndexer
    extends Object
    implements CleanupIF
    The WaybackIndexer starts threads that find new files to be indexed and then index them.

    A single producer thread runs on a timer, for example once a day. On each run it first invokes a FileNameHarvester to get a list of all files in the archive, then fills the indexer queue with any new files found.

    Concurrently, a family of consumer threads waits for the queue to be populated, takes elements from it, and indexes them.
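
    The producer/consumer arrangement described above can be sketched roughly as follows. This is a minimal illustration, not the actual WaybackIndexer internals: the class IndexerQueueSketch, its methods, and the use of a harvester Supplier in place of FileNameHarvester are all hypothetical, and "indexing" is reduced to recording the file name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch of a timer-driven producer feeding consumer threads. */
class IndexerQueueSketch {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Set<String> seen = ConcurrentHashMap.newKeySet();
    private final Set<String> indexed = ConcurrentHashMap.newKeySet();

    /** Producer step: enqueue only names not seen before; returns how many were added. */
    int produce(List<String> archiveListing) {
        int added = 0;
        for (String name : archiveListing) {
            if (seen.add(name)) { // add() returns false for already-known files
                queue.add(name);
                added++;
            }
        }
        return added;
    }

    /** Runs the producer step periodically; the harvester stands in for FileNameHarvester. */
    ScheduledExecutorService startProducer(Supplier<List<String>> harvester,
                                           long period, TimeUnit unit) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "producer");
            t.setDaemon(true); // assumption: do not keep the JVM alive
            return t;
        });
        timer.scheduleAtFixedRate(() -> produce(harvester.get()), 0, period, unit);
        return timer;
    }

    /** Starts n consumer threads that block until the queue has work. */
    List<Thread> startConsumers(int n) {
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        String name = queue.take(); // blocks while the queue is empty
                        indexed.add(name);          // placeholder for real indexing
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // exit quietly on interrupt
                }
            }, "consumer-" + i);
            t.setDaemon(true);
            t.start();
            threads.add(t);
        }
        return threads;
    }

    /** Waits until at least 'expected' files are indexed or the timeout elapses. */
    boolean awaitIndexed(int expected, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        try {
            while (indexed.size() < expected) {
                if (System.currentTimeMillis() > deadline) {
                    return false;
                }
                Thread.sleep(10);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    Set<String> indexed() {
        return indexed;
    }
}
```

    A blocking queue lets the consumers sleep until the producer adds work, and the set of previously seen names is one simple way to ensure that only new files are queued on each timer run. How the real indexer tracks already-indexed files is not stated here.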

    • Method Detail

      • getInstance

        public static WaybackIndexer getInstance()
        Factory method that creates the singleton wayback indexer and starts it running. As a side effect, it creates the output directories for the indexer if they do not already exist. It also reads files for the initial ingest if necessary.
        Returns:
        the indexer.
      • cleanup

        public void cleanup()
        Performs any necessary cleanup. This includes removing any partial batch output from the temporary batch output file and closing the Hibernate session factory.
        Specified by:
        cleanup in interface CleanupIF
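
    The lifecycle implied by getInstance and cleanup might look roughly like the sketch below. Everything in it is an assumption made for illustration: the class IndexerLifecycleSketch, the directory and file names, the synchronized lazy initialization, and the choice to delete the temporary file outright. The real class also closes a Hibernate session factory, which is omitted here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical stand-in for a singleton with directory setup and cleanup. */
class IndexerLifecycleSketch {
    private static IndexerLifecycleSketch instance;

    private final Path outputDir;
    private final Path tempBatchFile;

    private IndexerLifecycleSketch(Path outputDir) {
        try {
            // Side effect of creation: make the output directory if it is missing.
            Files.createDirectories(outputDir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        this.outputDir = outputDir;
        this.tempBatchFile = outputDir.resolve("batch-output.tmp"); // assumed name
    }

    /** Returns the single shared instance, creating it on first use. */
    static synchronized IndexerLifecycleSketch getInstance() {
        if (instance == null) {
            Path dir = Paths.get(System.getProperty("java.io.tmpdir"),
                                 "wayback-indexer-sketch"); // assumed location
            instance = new IndexerLifecycleSketch(dir);
        }
        return instance;
    }

    /** Simulates a batch job that left partial output behind. */
    void writePartialOutput(String content) {
        try {
            Files.write(tempBatchFile, content.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Discards partial batch output; a real cleanup would also release other resources. */
    void cleanup() {
        try {
            Files.deleteIfExists(tempBatchFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    Path outputDir() {
        return outputDir;
    }

    Path tempBatchFile() {
        return tempBatchFile;
    }
}
```

    Callers would obtain the shared instance through getInstance() and invoke cleanup() at shutdown, so that a half-written batch file is not mistaken for complete output on the next start.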