Interface | Description |
---|---|
DeduplicateToCDXAdapterInterface | Interface describing a class which can be used to convert duplicate records in a crawl log to wayback-compatible cdx records. |

Class | Description |
---|---|
DeduplicateToCDXAdapter | Class containing methods for turning duplicate entries in a crawl log into lines in a CDX index file. |
DeduplicationCDXExtractionBatchJob | This batch job takes deduplication records from a crawl log in a metadata ARC file and converts them to CDX records for use in wayback. |
UrlCanonicalizerFactory | A factory for returning a UrlCanonicalizer. |
WaybackCDXExtractionARCBatchJob | Returns a CDX file using the appropriate format for wayback, including canonicalisation of URLs. |
WaybackCDXExtractionWARCBatchJob | Returns a CDX file using the appropriate format for wayback, including canonicalisation of URLs. |
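
The conversion of a duplicate crawl-log entry into a CDX line, as described for DeduplicateToCDXAdapter, might look roughly like the sketch below. It is not the package's implementation: the class `DedupToCdxSketch`, the method `adaptLine`, the assumed whitespace-separated crawl-log field order, the `duplicate:"<file>,<offset>"` annotation shape, the output column order, and the lowercase-only canonicalisation are all assumptions made for illustration.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; not the NetarchiveSuite implementation. */
final class DedupToCdxSketch {

    // Assumed annotation shape: duplicate:"<archive file>,<offset>"
    private static final Pattern DUPLICATE =
            Pattern.compile("duplicate:\"([^,\"]+),(\\d+)\"");

    /**
     * Returns a CDX-style line for a crawl-log line that carries a
     * duplicate annotation, or null if the line cannot be converted.
     */
    static String adaptLine(String crawlLogLine) {
        if (crawlLogLine == null) {
            return null;
        }
        Matcher m = DUPLICATE.matcher(crawlLogLine);
        if (!m.find()) {
            return null; // not a deduplication record
        }
        // Assumed field order: timestamp, status, size, URL, path,
        // referrer, MIME type, thread, fetch time, digest, ...
        String[] f = crawlLogLine.trim().split("\\s+");
        if (f.length < 10) {
            return null;
        }
        // Reduce an ISO-style timestamp to 14 digits (yyyyMMddHHmmss).
        String timestamp = f[0].replaceAll("\\D", "");
        if (timestamp.length() < 14) {
            return null;
        }
        timestamp = timestamp.substring(0, 14);
        String url = f[3];
        String digest = f[9].replaceFirst("^sha1:", "");
        return String.join(" ",
                canonicalize(url), timestamp, url, f[6], f[1],
                digest, "-", m.group(2), m.group(1));
    }

    // Stand-in only: a real canonicaliser does far more than lowercasing.
    private static String canonicalize(String url) {
        return url.toLowerCase(Locale.ROOT);
    }
}
```

Lines without a duplicate annotation yield `null`, so a caller can stream a whole crawl log through `adaptLine` and keep only the non-null results.
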
Copyright © 2005–2016 The Royal Danish Library, the Danish State and University Library, the National Library of France and the Austrian National Library. All rights reserved.