Class Reporting
java.lang.Object
    dk.netarkivet.viewerproxy.webinterface.Reporting
public class Reporting extends Object
Methods for generating the batch results needed by the QA pages.
Method Summary
Modifier and Type, Method, Description:
static File  getCrawlLogForDomainInJob(String domain, long jobid)
    Submit a batch job to extract the part of a crawl log that is associated with the given domain and job.
static File  getCrawlLogLinesMatchingDomain(long jobID, String domain)
static File  getCrawlLoglinesMatchingRegexp(long jobid, String regexp)
    Return any crawllog lines for a given jobid matching the given regular expression.
static List<String>  getFilesForJob(long jobid, String harvestprefix)
    Retrieve a list of all files uploaded for a given harvest job.
static List<CDXRecord>  getMetadataCDXRecordsForJob(long jobid)
    Depending on settings, submits either a Hadoop job or batch job to generate cdx for all metadata files for a job, and returns the results in a list.
Method Detail
getFilesForJob
public static List<String> getFilesForJob(long jobid, String harvestprefix)
Retrieve a list of all files uploaded for a given harvest job. For installations that use batch, this is done via a batch job; for Hadoop-based installations, it is done via an implementation of dk.netarkivet.common.utils.service.FileResolver.
Parameters:
jobid - the job for which files are required
harvestprefix - the prefix for the (w)arc datafiles for this job, as determined by the implementation of ArchiveFileNaming used in the installation
Returns:
a list of filenames
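Because the result is a plain list of filenames, a caller may want to narrow it before further processing. The following is a minimal sketch of one such post-processing step, not part of Reporting itself. The class ArchiveFileFilter and its methods are hypothetical, and the extension list (.arc, .warc, optionally gzipped) is an assumption about how the (w)arc datafiles are named.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper, not part of NetarchiveSuite.
public class ArchiveFileFilter {

    /** Keeps only names that look like (W)ARC data files, compressed or not. */
    public static List<String> onlyArchiveFiles(List<String> filenames) {
        return filenames.stream()
                .filter(ArchiveFileFilter::isArchiveFile)
                .collect(Collectors.toList());
    }

    /** Assumed naming convention: .arc, .arc.gz, .warc, .warc.gz (case-insensitive). */
    static boolean isArchiveFile(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        return lower.endsWith(".warc") || lower.endsWith(".warc.gz")
                || lower.endsWith(".arc") || lower.endsWith(".arc.gz");
    }

    // Possible usage, where jobid and harvestprefix come from the caller:
    //   List<String> all = Reporting.getFilesForJob(jobid, harvestprefix);
    //   List<String> archives = ArchiveFileFilter.onlyArchiveFiles(all);
}
```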
-
getMetadataCDXRecordsForJob
public static List<CDXRecord> getMetadataCDXRecordsForJob(long jobid)
Depending on settings, submits either a Hadoop job or batch job to generate cdx for all metadata files for a job, and returns the results in a list.
Parameters:
jobid - The job to get cdx for.
Returns:
A list of cdx records.
Throws:
ArgumentNotValid - If jobid is 0 or negative.
-
getCrawlLogForDomainInJob
public static File getCrawlLogForDomainInJob(String domain, long jobid)
Submit a batch job to extract the part of a crawl log that is associated with the given domain and job.
Parameters:
domain - The domain to get crawl.log-lines for.
jobid - The jobid to get the crawl.log-lines for.
Returns:
A file containing the crawl.log lines. This file is temporary, and should be deleted after use.
Throws:
ArgumentNotValid - On negative jobids, or if domain is null or the empty string.
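Since the returned file is temporary and the caller is responsible for deleting it, one way to honour that contract is to read the file inside a try/finally block. This is a minimal sketch under stated assumptions: TempCrawlLogReader is a hypothetical helper, the UTF-8 charset is an assumption about the log encoding, and reading all lines into memory is only suitable for small extracts.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.List;

// Hypothetical helper, not part of NetarchiveSuite.
public class TempCrawlLogReader {

    /**
     * Reads every line of the given temporary file, then deletes the file,
     * even if reading fails. The charset (UTF-8) is an assumption.
     */
    public static List<String> readAndDelete(File tempFile) throws IOException {
        try {
            return Files.readAllLines(tempFile.toPath(), StandardCharsets.UTF_8);
        } finally {
            // The temporary file is removed whether or not the read succeeded.
            Files.deleteIfExists(tempFile.toPath());
        }
    }

    // Possible usage:
    //   File log = Reporting.getCrawlLogForDomainInJob(domain, jobid);
    //   List<String> lines = TempCrawlLogReader.readAndDelete(log);
}
```

For large crawl logs, streaming the lines (for example with Files.lines in a try-with-resources block) would avoid holding the whole file in memory.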
-
getCrawlLoglinesMatchingRegexp
public static File getCrawlLoglinesMatchingRegexp(long jobid, String regexp)
Return any crawllog lines for a given jobid matching the given regular expression.
Parameters:
jobid - The jobid
regexp - A regular expression
Returns:
a File with the matching lines.
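The description does not say whether the expression must match a whole crawl log line or only part of it. A pattern of the form .*literal.* selects the same single-line entries under either interpretation, so a small helper can build one safely. This is a sketch only: CrawlLogRegexps and containing are hypothetical names, not part of the Reporting API.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of NetarchiveSuite.
public class CrawlLogRegexps {

    /**
     * Builds a regular expression matching any line that contains the given
     * text literally. Pattern.quote escapes metacharacters such as '.', so
     * "example.org" does not also match "exampleXorg".
     */
    public static String containing(String literal) {
        return ".*" + Pattern.quote(literal) + ".*";
    }

    // Possible usage:
    //   File matches = Reporting.getCrawlLoglinesMatchingRegexp(
    //           jobid, CrawlLogRegexps.containing("example.org"));
}
```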