Package dk.netarkivet.archive.indexserver

Interface Summary
RawDataCache An interface for getting raw data out of the bitarchives based on job IDs.
 

Class Summary
CDXDataCache A RawDataCache that serves files with CDX data.
CDXIndexCache A cache that serves CDX index files for job IDs.
CDXOriginCrawlLogIterator This subclass of CrawlLogIterator adds the layer of digging an origin of the form "arcfile,offset" out of a corresponding CDX index.
CombiningMultiFileBasedCache<T extends Comparable<T>> This class provides the framework for classes that cache the effort of combining multiple files into one.
CrawlLogDataCache This class implements the low-level cache for crawl log Lucene indexing.
CrawlLogIndexCache A cache that serves Lucene indices of crawl logs for given job IDs.
DedupCrawlLogIndexCache A cache of crawl log indices appropriate for the Icelandic deduplicator code, excluding all text entries.
DigestIndexerWorker This worker class handles the indexing of a single crawl log and its associated CDX file.
DigestOptions Encapsulates the options for the indexing process.
FileBasedCache<I> A generic cache that stores items in files.
FullCrawlLogIndexCache A CrawlLogIndexCache that takes in all entries in the crawl log.
IndexingState Stores the state of an indexing task.
IndexServer Index server.
IndexServerApplication This class is used to start the IndexServer application.
MultiFileBasedCache<T extends Comparable<T>> Implementation of a file-based cache that works on the assumption that it operates on a set of IDs, of which only a subset may be retrieved correctly.
RawMetadataCache This is an implementation of the RawDataCache specialized for data out of metadata files.
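
The descriptions of FileBasedCache and MultiFileBasedCache above suggest a pattern in which a cache is asked for a set of IDs, stores what it can in files, and reports the subset it actually obtained. The following is a minimal sketch of that idea only. The class SubsetCacheSketch, its methods, the fetcher parameter, and the cache file naming are all hypothetical and are not taken from this package's API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

/** Hypothetical illustration; not part of dk.netarkivet.archive.indexserver. */
class SubsetCacheSketch {
    private final Path cacheDir;
    // Assumed data source: returns empty when no data exists for an ID.
    private final Function<Long, Optional<String>> fetcher;

    SubsetCacheSketch(Path cacheDir, Function<Long, Optional<String>> fetcher) {
        this.cacheDir = cacheDir;
        this.fetcher = fetcher;
    }

    /** Convenience factory that places the cache in a fresh temporary directory. */
    static SubsetCacheSketch inTempDirectory(Function<Long, Optional<String>> fetcher) {
        try {
            return new SubsetCacheSketch(Files.createTempDirectory("cache-sketch"), fetcher);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** One cache file per ID; the naming scheme is an assumption of this sketch. */
    Path cacheFile(long id) {
        return cacheDir.resolve(id + "-cache");
    }

    /**
     * Tries to make sure a cache file exists for every requested ID.
     * Returns the subset of IDs that is available on disk afterwards.
     */
    Set<Long> cache(Set<Long> ids) {
        Set<Long> found = new TreeSet<>();
        for (Long id : ids) {
            Path file = cacheFile(id);
            if (Files.exists(file)) {
                found.add(id); // already cached, nothing to fetch
                continue;
            }
            Optional<String> data = fetcher.apply(id);
            if (data.isPresent()) {
                try {
                    Files.write(file, data.get().getBytes(StandardCharsets.UTF_8));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                found.add(id);
            }
        }
        return found;
    }
}
```

A caller of such a sketch would compare the returned set with the requested one and, if they differ, decide whether to proceed with the smaller set or report the missing IDs.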