dk.netarkivet.harvester.harvesting.distribute
Class DomainHarvestReport

java.lang.Object
  extended by dk.netarkivet.harvester.harvesting.distribute.DomainHarvestReport
All Implemented Interfaces:
java.io.Serializable
Direct Known Subclasses:
HeritrixDomainHarvestReport

public abstract class DomainHarvestReport
extends java.lang.Object
implements java.io.Serializable

Abstract base class defining the statistics that all crawlers are expected to deliver to this system.

See Also:
Serialized Form

Field Summary
protected  java.util.Map<java.lang.String,DomainStats> domainstats
          Data structure holding the domain information contained in one harvest.
 
Constructor Summary
DomainHarvestReport()
          Default constructor that does nothing.
 
Method Summary
 java.lang.Long getByteCount(java.lang.String domainName)
          Get the number of bytes downloaded for the given domain.
 java.util.Set<java.lang.String> getDomainNames()
          Returns the set of domain names that are contained in hosts-report.txt (i.e. host names mapped to domains).
 java.lang.Long getObjectCount(java.lang.String domainName)
          Get the number of objects found for the given domain.
 StopReason getStopReason(java.lang.String domainName)
          Get the StopReason for the given domain.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

domainstats

protected final java.util.Map<java.lang.String,DomainStats> domainstats
Data structure holding the domain information contained in one harvest.

Constructor Detail

DomainHarvestReport

public DomainHarvestReport()
Default constructor that does nothing. The real construction is expected to happen in the subclasses, which populate the domainstats map with crawl results.
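
A minimal sketch of the construction pattern described above, in which a subclass populates the inherited map from its own constructor. All types here are hypothetical stand-ins so that the example compiles on its own: SketchStopReason, SketchDomainStats and SketchReport only mirror the shape of StopReason, DomainStats and DomainHarvestReport, and their fields, constructor arguments and enum constants are assumptions, not the library's API. Initializing the map at its declaration is likewise an assumption.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for StopReason; the constant names are illustrative only.
enum SketchStopReason { COMPLETE, LIMIT_REACHED }

// Hypothetical stand-in for DomainStats: the figures recorded for one domain.
class SketchDomainStats implements Serializable {
    final long objectCount;
    final long byteCount;
    final SketchStopReason stopReason;

    SketchDomainStats(long objectCount, long byteCount, SketchStopReason stopReason) {
        this.objectCount = objectCount;
        this.byteCount = byteCount;
        this.stopReason = stopReason;
    }
}

// Mirrors the documented shape: a protected final map and a no-op constructor.
abstract class SketchReport implements Serializable {
    // Assumption: the map is created at its declaration, so subclasses only add entries.
    protected final Map<String, SketchDomainStats> domainstats = new HashMap<>();

    SketchReport() {
        // Intentionally empty; subclasses do the real work.
    }
}

// Hypothetical crawler-specific subclass that fills the map with fixed results.
class FixedResultsReport extends SketchReport {
    FixedResultsReport() {
        domainstats.put("example.org",
                new SketchDomainStats(12L, 34567L, SketchStopReason.COMPLETE));
        domainstats.put("example.net",
                new SketchDomainStats(3L, 2048L, SketchStopReason.LIMIT_REACHED));
    }

    // Helper for the sketch only: how many domains were recorded.
    int domainCount() {
        return domainstats.size();
    }
}
```

A real subclass would presumably derive these entries from the crawler's output instead of hard-coding them.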

Method Detail

getDomainNames

public final java.util.Set<java.lang.String> getDomainNames()
Returns the set of domain names that are contained in hosts-report.txt (i.e. host names mapped to domains).

Returns:
a Set of Strings, one per domain name

getObjectCount

public final java.lang.Long getObjectCount(java.lang.String domainName)
Get the number of objects found for the given domain.

Parameters:
domainName - A domain name (as given by getDomainNames())
Returns:
How many objects were collected for that domain
Throws:
ArgumentNotValid - if domainName is null or empty

getByteCount

public final java.lang.Long getByteCount(java.lang.String domainName)
Get the number of bytes downloaded for the given domain.

Parameters:
domainName - A domain name (as given by getDomainNames())
Returns:
How many bytes were collected for that domain
Throws:
ArgumentNotValid - if domainName is null or empty

getStopReason

public final StopReason getStopReason(java.lang.String domainName)
Get the StopReason for the given domain.

Parameters:
domainName - A domain name (as given by getDomainNames())
Returns:
the StopReason for the given domain.
Throws:
ArgumentNotValid - if domainName is null or empty
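
The accessors above are typically used together: iterate over getDomainNames() and query the per-domain figures for each name. The following is a hedged sketch of that usage against a hypothetical stand-in, not the real class. DemoReport, DemoStats, DemoStopReason and ReportTotals are invented for illustration, IllegalArgumentException stands in for ArgumentNotValid, and returning null for an unknown domain is an assumption that this page does not confirm.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical stand-in for StopReason; the constant names are illustrative only.
enum DemoStopReason { COMPLETE, LIMIT_REACHED }

// Hypothetical stand-in for DomainStats.
class DemoStats {
    final long objects;
    final long bytes;
    final DemoStopReason reason;

    DemoStats(long objects, long bytes, DemoStopReason reason) {
        this.objects = objects;
        this.bytes = bytes;
        this.reason = reason;
    }
}

// Stand-in exposing the four documented accessors over the domainstats map.
class DemoReport {
    protected final Map<String, DemoStats> domainstats = new TreeMap<>();

    final Set<String> getDomainNames() {
        return Collections.unmodifiableSet(domainstats.keySet());
    }

    final Long getObjectCount(String domainName) {
        DemoStats stats = lookup(domainName);
        return stats == null ? null : Long.valueOf(stats.objects);
    }

    final Long getByteCount(String domainName) {
        DemoStats stats = lookup(domainName);
        return stats == null ? null : Long.valueOf(stats.bytes);
    }

    final DemoStopReason getStopReason(String domainName) {
        DemoStats stats = lookup(domainName);
        return stats == null ? null : stats.reason;
    }

    // Rejects null or empty names; IllegalArgumentException stands in for ArgumentNotValid.
    private DemoStats lookup(String domainName) {
        if (domainName == null || domainName.isEmpty()) {
            throw new IllegalArgumentException("domainName must not be null or empty");
        }
        // Assumption: an unknown domain yields null rather than an exception.
        return domainstats.get(domainName);
    }
}

class ReportTotals {
    // Sums the byte counts of every domain named by the report.
    static long totalBytes(DemoReport report) {
        long total = 0L;
        for (String domain : report.getDomainNames()) {
            total += report.getByteCount(domain);
        }
        return total;
    }

    // Builds a small report with made-up figures for the example.
    static DemoReport sample() {
        DemoReport report = new DemoReport();
        report.domainstats.put("example.org",
                new DemoStats(12L, 34567L, DemoStopReason.COMPLETE));
        report.domainstats.put("example.net",
                new DemoStats(3L, 2048L, DemoStopReason.LIMIT_REACHED));
        return report;
    }
}
```

Because only names obtained from getDomainNames() are passed back in, the loop in totalBytes never reaches the null-or-empty check or the unknown-domain case.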