Interface | Description |
---|---|
DedupAttributeConstants | Lifted from H1 AdaptiveRevisitAttributeConstants and limited to what DeDuplicator was using. |

Class | Description |
---|---|
ArchiveDateConverter | TODO merge with dk.netarkivet.common.utils.archive.ArchiveDateConverter |
CommandLineParser | Print DigestIndexer command-line usage message. |
CrawlDataItem | A base class for individual items of crawl data that should be added to the index. |
CrawlDataIterator | An abstract base class for iterators that iterate over different sets of crawl data. |
CrawlLogIterator | A CrawlDataIterator implementation capable of iterating over a Heritrix-style crawl.log. |
DeDuplicator | Heritrix-compatible processor. |
DigestIndexer | A class for building a de-duplication index. |

Enum | Description |
---|---|
DeDuplicator.AnalysisMode | |
DeDuplicator.FilterMode | |
DeDuplicator.MatchingMethod | |
DeDuplicator.OriginHandling | |
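
The summaries above mention iterating over a Heritrix-style crawl.log and building a de-duplication index. As a rough illustration of that idea only, the sketch below maps each content digest to the first URL seen with it. It does not use or reproduce this package's API: the class `DigestIndexSketch` and its methods are hypothetical, and the field positions (URL in field 4, digest in field 10 of a whitespace-separated line) are an assumption about the log layout.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: a hypothetical class, not part of the package documented above. */
final class DigestIndexSketch {

    // Assumed crawl.log layout: whitespace-separated fields,
    // URL in field 4 and content digest in field 10 (1-based).
    private static final int URL_FIELD = 3;
    private static final int DIGEST_FIELD = 9;

    /** Maps each content digest to the first URL seen with that digest. */
    static Map<String, String> buildIndex(List<String> crawlLogLines) {
        Map<String, String> index = new LinkedHashMap<>();
        for (String line : crawlLogLines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length <= DIGEST_FIELD) {
                continue; // skip malformed or truncated lines
            }
            index.putIfAbsent(fields[DIGEST_FIELD], fields[URL_FIELD]);
        }
        return index;
    }

    /** Returns true if a document with this digest is already indexed. */
    static boolean isDuplicate(Map<String, String> index, String digest) {
        return index.containsKey(digest);
    }

    public static void main(String[] args) {
        // Two made-up log lines that share one digest.
        List<String> log = List.of(
            "2018-01-01T00:00:00.000Z 200 120 http://example.org/a L http://example.org/ text/html #001 20180101000000000+5 sha1:AAA - -",
            "2018-01-01T00:00:01.000Z 200 120 http://example.org/b L http://example.org/ text/html #002 20180101000001000+5 sha1:AAA - -");
        Map<String, String> index = buildIndex(log);
        System.out.println(index);
    }
}
```

A real index would likely be persistent and carry more per-item data (for example, the fetch timestamp and origin); the in-memory map here is only meant to show the digest-to-URL lookup that de-duplication relies on.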

Copyright © 2005–2018 The Royal Danish Library, the National Library of France and the Austrian National Library. All rights reserved.