Package | Description |
---|---|
dk.netarkivet.common.utils.archive | |
dk.netarkivet.common.utils.cdx | |
dk.netarkivet.viewerproxy.webinterface | |
dk.netarkivet.wayback.batch | |
Modifier and Type | Class and Description |
---|---|
class | ArchiveBatchJob: Abstract class defining a batch job to run on a set of ARC/WARC files. |
class | GetMetadataArchiveBatchJob: A batch job that extracts metadata. |
Modifier and Type | Class and Description |
---|---|
class | ArchiveExtractCDXJob: Batch job that extracts information to create a CDX file. |
Modifier and Type | Class and Description |
---|---|
class | CrawlLogLinesMatchingRegexp: Batch job that extracts lines from a crawl log matching a regular expression. The batch job should be restricted to run on metadata files for a specific job only, using the FileBatchJob.processOnlyFilesMatching(String) construct. |
class | HarvestedUrlsForDomainBatchJob: Batch job that extracts lines referring to a specific domain from a crawl log. |
Modifier and Type | Class and Description |
---|---|
class | DeduplicationCDXExtractionBatchJob: This batch job takes deduplication records from a crawl log in a metadata ARC file and converts them to CDX records for use in Wayback. |
Copyright © 2005–2015 The Royal Danish Library, the Danish State and University Library, the National Library of France and the Austrian National Library. All rights reserved.