Package | Description |
---|---|
dk.netarkivet.harvester.indexserver | |
is.hi.bok.deduplicator | |
Modifier and Type | Class and Description |
---|---|
class | CDXOriginCrawlLogIterator: a subclass of CrawlLogIterator that additionally looks up an origin of the form "arcfile,offset" in a corresponding CDX index. |
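
The "arcfile,offset" origin format mentioned above can be illustrated with a minimal, self-contained sketch. It does not use CDXOriginCrawlLogIterator itself; the class `OriginExample`, the holder type `Origin`, the method `parseOrigin`, and the sample file name are all hypothetical names introduced here for illustration only.

```java
public final class OriginExample {

    /** Hypothetical holder for the two parts of an origin string. */
    public static final class Origin {
        public final String arcFile;
        public final long offset;

        Origin(String arcFile, long offset) {
            this.arcFile = arcFile;
            this.offset = offset;
        }
    }

    /**
     * Splits a string of the form "arcfile,offset" at its last comma.
     * Using the last comma is an assumption made here so that a file
     * name containing a comma would still parse.
     */
    public static Origin parseOrigin(String origin) {
        int comma = origin.lastIndexOf(',');
        if (comma <= 0 || comma == origin.length() - 1) {
            throw new IllegalArgumentException(
                    "Expected \"arcfile,offset\" but got: " + origin);
        }
        String arcFile = origin.substring(0, comma);
        // Long.parseLong throws NumberFormatException on a non-numeric offset.
        long offset = Long.parseLong(origin.substring(comma + 1).trim());
        return new Origin(arcFile, offset);
    }

    public static void main(String[] args) {
        // The file name below is made up for the example.
        Origin o = parseOrigin("example-00001.arc,12345");
        System.out.println(o.arcFile + " @ " + o.offset);
    }
}
```
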
Modifier and Type | Class and Description |
---|---|
class | CrawlLogIterator: an implementation of CrawlDataIterator capable of iterating over a Heritrix-style crawl.log. |
Modifier and Type | Method and Description |
---|---|
long | DigestIndexer.writeToIndex(CrawlDataIterator dataIt, String mimefilter, boolean blacklist, String defaultOrigin, boolean verbose): Writes the contents of a CrawlDataIterator to this index. |
long | DigestIndexer.writeToIndex(CrawlDataIterator dataIt, String mimefilter, boolean blacklist, String defaultOrigin, boolean verbose, boolean skipDuplicates): Writes the contents of a CrawlDataIterator to this index. |
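
This page does not describe how the `mimefilter` and `blacklist` parameters interact. The sketch below shows one plausible reading, stated here as an assumption rather than as the library's documented behavior: `mimefilter` is treated as a regular expression over MIME types, and `blacklist` decides whether matching types are excluded (true) or are the only ones included (false). The class `MimeFilterExample` and its methods are hypothetical stand-ins and do not call DigestIndexer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public final class MimeFilterExample {

    /**
     * Returns true if an item with the given MIME type passes the filter.
     * Assumed semantics: blacklist == true rejects matching types,
     * blacklist == false accepts only matching types.
     */
    public static boolean accept(String mimeType, String mimefilter, boolean blacklist) {
        boolean matches = Pattern.matches(mimefilter, mimeType);
        return blacklist ? !matches : matches;
    }

    /** Counts how many of the given MIME types would pass the filter. */
    public static long countAccepted(List<String> mimeTypes, String mimefilter, boolean blacklist) {
        long count = 0;
        for (String mimeType : mimeTypes) {
            if (accept(mimeType, mimefilter, blacklist)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> types = Arrays.asList("text/html", "image/png", "text/css");
        // Blacklist mode: everything that is not text/* passes.
        System.out.println(countAccepted(types, "^text/.*", true));  // prints 1
        // Whitelist mode: only text/* passes.
        System.out.println(countAccepted(types, "^text/.*", false)); // prints 2
    }
}
```

The `long` return value in the sketch mirrors the `long` return type of `writeToIndex` listed above, on the assumption that it represents a count of written items.
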
Copyright © 2005–2016 The Royal Danish Library, the Danish State and University Library, the National Library of France and the Austrian National Library. All rights reserved.