Class AbstractHarvestReport
- java.lang.Object
  - dk.netarkivet.harvester.harvesting.report.AbstractHarvestReport
- All Implemented Interfaces:
HarvestReport, java.io.Serializable
- Direct Known Subclasses:
BnfHarvestReport, LegacyHarvestReport
public abstract class AbstractHarvestReport extends java.lang.Object implements HarvestReport
Base implementation for a harvest report.
- See Also:
- Serialized Form
-
Constructor Summary
Constructors
Constructor | Description
AbstractHarvestReport() | Default constructor that does nothing.
AbstractHarvestReport(DomainStatsReport dsr) | Constructor from DomainStatsReports.
-
Method Summary
Modifier and Type | Method | Description
java.lang.Long | getByteCount(java.lang.String domainName) | Get the number of bytes downloaded for the given domain.
StopReason | getDefaultStopReason() | Returns the default stop reason initially assigned to every domain.
java.util.Set<java.lang.String> | getDomainNames() | Returns the set of domain names that are contained in hosts-report.txt (i.e. host names mapped to domains).
java.lang.Long | getObjectCount(java.lang.String domainName) | Get the number of objects found for the given domain.
protected DomainStats | getOrCreateDomainStats(java.lang.String domainName) | Attempts to get an already existing DomainStats object for that domain, and if not found creates one with zero values.
StopReason | getStopReason(java.lang.String domainName) | Get the StopReason for the given domain.
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface dk.netarkivet.harvester.harvesting.report.HarvestReport
postProcess
-
Constructor Detail
-
AbstractHarvestReport
public AbstractHarvestReport()
Default constructor that does nothing. The real construction is supposed to be done in the subclasses by filling out the domainStats map with crawl results.
-
AbstractHarvestReport
public AbstractHarvestReport(DomainStatsReport dsr)
Constructor from DomainStatsReports.
- Parameters:
dsr - the result of parsing the crawl.log for domain statistics
-
-
Method Detail
-
getDefaultStopReason
public StopReason getDefaultStopReason()
Description copied from interface: HarvestReport
Returns the default stop reason initially assigned to every domain.
- Specified by:
getDefaultStopReason in interface HarvestReport
-
getDomainNames
public final java.util.Set<java.lang.String> getDomainNames()
Returns the set of domain names that are contained in hosts-report.txt (i.e. host names mapped to domains).
- Specified by:
getDomainNames in interface HarvestReport
- Returns:
a Set of Strings
-
getObjectCount
public final java.lang.Long getObjectCount(java.lang.String domainName)
Get the number of objects found for the given domain.
- Specified by:
getObjectCount in interface HarvestReport
- Parameters:
domainName - A domain name (as given by getDomainNames())
- Returns:
How many objects were collected for that domain, or null if none were found
-
getByteCount
public final java.lang.Long getByteCount(java.lang.String domainName)
Get the number of bytes downloaded for the given domain.
- Specified by:
getByteCount in interface HarvestReport
- Parameters:
domainName - A domain name (as given by getDomainNames())
- Returns:
How many bytes were collected for that domain, or null if no information is available for this domain.
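Because getObjectCount and getByteCount return a boxed Long that may be null, callers need a null check before doing arithmetic on the result. The following is a minimal sketch of that pattern, not code from NetarchiveSuite: ReportView and MapBackedReport are hypothetical stand-ins that only mirror the three documented signatures, so the example compiles without the library.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Demo entry point, declared first so the file can be launched directly.
class ReportTotalsDemo {

    /** Sums object counts over all domains, skipping domains with a null count. */
    static long totalObjects(ReportView report) {
        long total = 0;
        for (String domain : report.getDomainNames()) {
            Long count = report.getObjectCount(domain);
            if (count != null) { // null: nothing recorded for this domain
                total += count;
            }
        }
        return total;
    }

    /** Sums byte counts over all domains, skipping domains with a null count. */
    static long totalBytes(ReportView report) {
        long total = 0;
        for (String domain : report.getDomainNames()) {
            Long bytes = report.getByteCount(domain);
            if (bytes != null) { // null: no information available for this domain
                total += bytes;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        MapBackedReport report = new MapBackedReport();
        report.put("example.org", 3, 100);
        report.put("example.com", 2, 50);
        System.out.println("objects=" + totalObjects(report) + ", bytes=" + totalBytes(report));
    }
}

// Hypothetical stand-in mirroring three method signatures documented above.
interface ReportView {
    Set<String> getDomainNames();
    Long getObjectCount(String domainName);
    Long getByteCount(String domainName);
}

// Hypothetical in-memory implementation used only for this illustration.
class MapBackedReport implements ReportView {
    // Each value holds {objectCount, byteCount} for one domain.
    private final Map<String, long[]> stats = new LinkedHashMap<>();

    void put(String domain, long objects, long bytes) {
        stats.put(domain, new long[] {objects, bytes});
    }

    @Override
    public Set<String> getDomainNames() {
        return stats.keySet();
    }

    @Override
    public Long getObjectCount(String domainName) {
        long[] s = stats.get(domainName);
        return s == null ? null : Long.valueOf(s[0]);
    }

    @Override
    public Long getByteCount(String domainName) {
        long[] s = stats.get(domainName);
        return s == null ? null : Long.valueOf(s[1]);
    }
}
```

Against a real report, the same loops would apply to any HarvestReport instance, assuming its methods behave as documented here.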
-
getStopReason
public final StopReason getStopReason(java.lang.String domainName)
Get the StopReason for the given domain.
- Specified by:
getStopReason in interface HarvestReport
- Parameters:
domainName - A domain name (as given by getDomainNames())
- Returns:
the StopReason for the given domain, or null if no StopReason was found for this domain
- Throws:
ArgumentNotValid - if domainName is null or empty
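The contract above distinguishes two cases: an invalid argument (null or empty) raises an exception, while a valid but unknown domain yields null. The sketch below illustrates that distinction under stated assumptions. DemoStopReason, InvalidArgumentSketch, and StopReasonLookup are hypothetical stand-ins for StopReason, ArgumentNotValid, and the report class. The enum constants are invented for the example and do not claim to match the real StopReason values.

```java
import java.util.HashMap;
import java.util.Map;

// Demo entry point, declared first so the file can be launched directly.
class StopReasonLookupDemo {
    public static void main(String[] args) {
        StopReasonLookup lookup = new StopReasonLookup();
        lookup.record("example.org", DemoStopReason.FINISHED);

        System.out.println(lookup.getStopReason("example.org"));  // a recorded reason
        System.out.println(lookup.getStopReason("unknown.test")); // no reason recorded
        try {
            lookup.getStopReason("");
        } catch (InvalidArgumentSketch e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}

// Hypothetical stand-in for StopReason; the constants are invented.
enum DemoStopReason { FINISHED, LIMIT_REACHED }

// Hypothetical stand-in for ArgumentNotValid.
class InvalidArgumentSketch extends RuntimeException {
    InvalidArgumentSketch(String message) {
        super(message);
    }
}

// Hypothetical lookup class showing the documented argument contract.
class StopReasonLookup {
    private final Map<String, DemoStopReason> reasons = new HashMap<>();

    void record(String domainName, DemoStopReason reason) {
        reasons.put(domainName, reason);
    }

    /** Returns the recorded reason, or null if none exists for the domain. */
    DemoStopReason getStopReason(String domainName) {
        // Reject invalid input before touching the map.
        if (domainName == null || domainName.isEmpty()) {
            throw new InvalidArgumentSketch("domainName must be non-null and non-empty");
        }
        return reasons.get(domainName); // null when the domain is unknown
    }
}
```

Callers therefore need both a null check on the result and, if the domain name comes from untrusted input, handling for the exception.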
-
getOrCreateDomainStats
protected DomainStats getOrCreateDomainStats(java.lang.String domainName)
Attempts to get an already existing DomainStats object for that domain, and if not found creates one with zero values.
- Parameters:
domainName - the name of the domain to get DomainStats for.
- Returns:
a DomainStats object for the given domain name.
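The get-or-create behaviour described above can be sketched with Map.computeIfAbsent. This is an illustration only, not the library's implementation: DomainStatsSketch and DomainStatsRegistry are hypothetical stand-ins, and the field names are assumptions rather than the real DomainStats API.

```java
import java.util.HashMap;
import java.util.Map;

// Demo entry point, declared first so the file can be launched directly.
class GetOrCreateDemo {
    public static void main(String[] args) {
        DomainStatsRegistry registry = new DomainStatsRegistry();

        DomainStatsSketch stats = registry.getOrCreate("example.org");
        stats.objectCount += 1;   // record one fetched object
        stats.byteCount += 2048;  // and its size in bytes

        // A second lookup returns the same instance rather than a new one.
        DomainStatsSketch again = registry.getOrCreate("example.org");
        System.out.println(again == stats);
        System.out.println(again.objectCount + " objects, " + again.byteCount + " bytes");
    }
}

// Hypothetical stand-in for DomainStats; counters start at zero.
class DomainStatsSketch {
    long objectCount = 0;
    long byteCount = 0;
}

// Hypothetical holder for the per-domain statistics map.
class DomainStatsRegistry {
    private final Map<String, DomainStatsSketch> domainStats = new HashMap<>();

    /** Returns the existing entry for the domain, or creates a zeroed one. */
    DomainStatsSketch getOrCreate(String domainName) {
        return domainStats.computeIfAbsent(domainName, name -> new DomainStatsSketch());
    }

    int size() {
        return domainStats.size();
    }
}
```

A subclass that fills in crawl results could use such a helper to accumulate counts per domain without checking for a missing entry at every call site.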
-
-