dk.netarkivet.harvester.datamodel
Class DomainConfiguration

java.lang.Object
  extended by dk.netarkivet.harvester.datamodel.DomainConfiguration
All Implemented Interfaces:
Named

public class DomainConfiguration
extends java.lang.Object
implements Named

This class describes a configuration for harvesting a domain. It combines a number of seedlists, a number of passwords, an order template, and some specialised settings to define the way to harvest a domain.


Constructor Summary
DomainConfiguration(java.lang.String theConfigName, Domain domain, java.util.List<SeedList> seedlists, java.util.List<Password> passwords)
          Create a new configuration for a domain.
 
Method Summary
 void addHarvestInfo(HarvestInfo hi)
          Adds harvest information to the configuration's history.
 void addPassword(Password password)
          Add a password to the configuration.
 void addSeedList(SeedList seedlist)
          Add a new seedlist to the configuration.
 java.lang.String getComments()
          Returns comments.
 Domain getDomain()
          Returns the domain aggregating this configuration.
 long getExpectedNumberOfObjects(long objectLimit, long byteLimit)
          Gets the best expectation for how many objects a harvest using this configuration will retrieve, given a job with a maximum limit per domain.
(package private)  long getID()
          Get the ID of this configuration.
 long getMaxBytes()
          Returns the maximum number of bytes to download during a single harvest of a domain.
 long getMaxObjects()
          Returns the maximum number of objects to harvest from the domain.
 int getMaxRequestRate()
          Returns the maximum request rate to use when harvesting the domain.
 java.lang.String getName()
          Get the configuration name.
 java.lang.String getOrderXmlName()
          Returns the name of the order xml file used by the domain.
 java.util.Iterator<Password> getPasswords()
          Get an iterator of passwords used in this configuration.
 java.util.Iterator<SeedList> getSeedLists()
          Get an iterator of seedlists used in this configuration.
(package private)  boolean hasID()
          Check if this configuration has an ID set yet (doesn't happen until the DBDAO persists it).
 long minObjectsBytesLimit(long objectLimit, long byteLimit, long expectedObjectSize)
          Return the lower of the two limits, or MAX_DOMAIN_SIZE if both are infinite, which is the max size we harvest from this domain.
 void removePassword(java.lang.String passwordName)
          Remove a password from the list of passwords used in this domain.
 void setComments(java.lang.String comments)
          Set the comments field.
(package private)  void setID(long id)
          Set the ID of this configuration.
 void setMaxBytes(long maxBytes)
          Specify the maximum number of bytes to download from a domain in a single harvest.
 void setMaxObjects(long max)
          Specify the maximum number of objects to retrieve from the domain.
 void setMaxRequestRate(int maxrate)
          Specify the maximum request rate to use when harvesting data.
 void setOrderXmlName(java.lang.String ordername)
          Specify the name of the order.xml template to use.
 void setPasswords(java.util.List<Password> passwords)
          Sets the used passwords to the given list.
 void setSeedLists(java.util.List<SeedList> seedlists)
          Sets the used seedlists to the given list.
 java.lang.String toString()
          ToString of DomainConfiguration class.
 boolean usesPassword(java.lang.String passwordName)
          Check whether this domain uses a given password.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

DomainConfiguration

public DomainConfiguration(java.lang.String theConfigName,
                           Domain domain,
                           java.util.List<SeedList> seedlists,
                           java.util.List<Password> passwords)
Create a new configuration for a domain.

Parameters:
theConfigName - The name of this configuration
domain - The domain that this configuration is for
seedlists - Seedlists to use in this configuration.
passwords - Passwords to use in this configuration.
Method Detail

setOrderXmlName

public void setOrderXmlName(java.lang.String ordername)
Specify the name of the order.xml template to use.

Parameters:
ordername - order.xml template name
Throws:
ArgumentNotValid - if ordername is null or empty

setMaxObjects

public void setMaxObjects(long max)
Specify the maximum number of objects to retrieve from the domain.

Parameters:
max - maximum number of objects to retrieve
Throws:
ArgumentNotValid - if max < -1

setMaxRequestRate

public void setMaxRequestRate(int maxrate)
Specify the maximum request rate to use when harvesting data.

Parameters:
maxrate - the maximum request rate
Throws:
ArgumentNotValid - if maxrate < 0

setMaxBytes

public void setMaxBytes(long maxBytes)
Specify the maximum number of bytes to download from a domain in a single harvest.

Parameters:
maxBytes - Maximum number of bytes to download, or -1 for no limit.
Throws:
ArgumentNotValid - if maxBytes < -1
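
The limit contract described above (-1 means no limit, anything lower is rejected) can be illustrated with a minimal sketch. ByteLimitSketch is a hypothetical stand-in, not the NetarchiveSuite class. IllegalArgumentException stands in for ArgumentNotValid, and the initial value of -1 and the isUnlimited helper are assumptions made for illustration.

```java
/** Illustrative stand-in for the byte-limit contract; not the real DomainConfiguration. */
final class ByteLimitSketch {
    /** Sentinel documented for setMaxBytes: -1 means "no limit". */
    static final long NO_LIMIT = -1L;

    /** Assumption: a fresh instance starts without a limit. */
    private long maxBytes = NO_LIMIT;

    /** Accepts -1 (no limit) or any larger value; rejects values below -1. */
    void setMaxBytes(long maxBytes) {
        if (maxBytes < NO_LIMIT) {
            // The real class is documented to throw ArgumentNotValid here.
            throw new IllegalArgumentException("maxBytes must be >= -1, was " + maxBytes);
        }
        this.maxBytes = maxBytes;
    }

    long getMaxBytes() {
        return maxBytes;
    }

    /** Hypothetical convenience check, not part of the documented API. */
    boolean isUnlimited() {
        return maxBytes == NO_LIMIT;
    }
}
```

Callers that read the limit back should treat -1 as a sentinel rather than as a number to compare against.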

getName

public java.lang.String getName()
Get the configuration name.

Specified by:
getName in interface Named
Returns:
the configuration name

getComments

public java.lang.String getComments()
Returns comments.

Specified by:
getComments in interface Named
Returns:
string containing comments

getOrderXmlName

public java.lang.String getOrderXmlName()
Returns the name of the order xml file used by the domain.

Returns:
name of the order.xml file that should be used when harvesting the domain

getMaxObjects

public long getMaxObjects()
Returns the maximum number of objects to harvest from the domain.

Returns:
maximum number of objects to harvest

getMaxRequestRate

public int getMaxRequestRate()
Returns the maximum request rate to use when harvesting the domain.

Returns:
maximum request rate

getMaxBytes

public long getMaxBytes()
Returns the maximum number of bytes to download during a single harvest of a domain.

Returns:
Maximum bytes limit, or -1 for no limit.

getDomain

public Domain getDomain()
Returns the domain aggregating this configuration.

Returns:
the Domain aggregating this configuration.

addHarvestInfo

public void addHarvestInfo(HarvestInfo hi)
Adds harvest information to the configuration's history.

Parameters:
hi - HarvestInfo to add to Domain.

getSeedLists

public java.util.Iterator<SeedList> getSeedLists()
Get an iterator of seedlists used in this configuration.

Returns:
seedlists as iterator

addSeedList

public void addSeedList(SeedList seedlist)
Add a new seedlist to the configuration.

Parameters:
seedlist - the seedlist to add
Throws:
ArgumentNotValid - if the seedlist is null
UnknownID - if the seedlist is not defined on the domain
PermissionDenied - if the seedlist is different from the one on the domain.

getPasswords

public java.util.Iterator<Password> getPasswords()
Get an iterator of passwords used in this configuration.

Returns:
The passwords in an iterator

addPassword

public void addPassword(Password password)
Add a password to the configuration.

Parameters:
password - the password to add

getExpectedNumberOfObjects

public long getExpectedNumberOfObjects(long objectLimit,
                                       long byteLimit)
Gets the best expectation for how many objects a harvest using this configuration will retrieve, given a job with a maximum limit per domain.

Parameters:
objectLimit - The maximum limit, or Constants.HERITRIX_MAXOBJECTS_INFINITY for no limit. This limit overrides the limit set on the configuration, unless override is in effect.
byteLimit - The maximum number of bytes that will be used as limit in the harvest. This limit overrides the limit set on the configuration, unless override is in effect.
Returns:
The expected number of objects.

minObjectsBytesLimit

public long minObjectsBytesLimit(long objectLimit,
                                 long byteLimit,
                                 long expectedObjectSize)
Return the lower of the two limits, or MAX_DOMAIN_SIZE if both are infinite, which is the max size we harvest from this domain.

Parameters:
objectLimit - A long value defining an object limit, or 0 for infinite
byteLimit - A long value defining a byte limit, or HarvesterSettings.MAX_DOMAIN_SIZE for infinite.
expectedObjectSize - The expected number of bytes per object
Returns:
The lowest of the two boundaries, or MAX_DOMAIN_SIZE if both are unlimited.
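
One way to read this contract is sketched below. It is not the NetarchiveSuite implementation. The sketch assumes the result is expressed in objects, that the byte limit is converted to an object count by dividing by expectedObjectSize, and that MAX_DOMAIN_SIZE is a fixed constant (in the real system it is a HarvesterSettings value; 5000 here is a placeholder). The sentinels follow the parameter descriptions above.

```java
/** Illustrative reading of the "lower of two limits" rule; names and values are assumptions. */
final class LimitSketch {
    /** Placeholder for the HarvesterSettings.MAX_DOMAIN_SIZE setting. */
    static final long MAX_DOMAIN_SIZE = 5000L;
    /** Sentinel for "no object limit", as documented for objectLimit. */
    static final long OBJECTS_INFINITE = 0L;

    static long minObjectsBytesLimit(long objectLimit, long byteLimit, long expectedObjectSize) {
        if (expectedObjectSize <= 0) {
            throw new IllegalArgumentException("expectedObjectSize must be positive");
        }
        boolean objectsUnlimited = objectLimit == OBJECTS_INFINITE;
        // Documented sentinel for "no byte limit" is MAX_DOMAIN_SIZE itself.
        boolean bytesUnlimited = byteLimit == MAX_DOMAIN_SIZE;

        if (objectsUnlimited && bytesUnlimited) {
            return MAX_DOMAIN_SIZE;
        }
        // Assumption: translate the byte budget into a number of objects.
        long objectsByBytes = byteLimit / expectedObjectSize;
        if (objectsUnlimited) {
            return objectsByBytes;
        }
        if (bytesUnlimited) {
            return objectLimit;
        }
        return Math.min(objectLimit, objectsByBytes);
    }
}
```

With an expected object size of 1000 bytes, an object limit of 100 and a byte limit of 50000, this sketch returns 50, because the byte budget is the tighter of the two bounds.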

setComments

public void setComments(java.lang.String comments)
Set the comments field.

Parameters:
comments - User-entered free-form comments.

removePassword

public void removePassword(java.lang.String passwordName)
Remove a password from the list of passwords used in this domain.

Parameters:
passwordName - Name of the password to remove.

usesPassword

public boolean usesPassword(java.lang.String passwordName)
Check whether this domain uses a given password.

Parameters:
passwordName - The name of the password to check for
Returns:
whether the given password is used

setSeedLists

public void setSeedLists(java.util.List<SeedList> seedlists)
Sets the used seedlists to the given list. Note: list is copied.

Parameters:
seedlists - The seedlists to use.
Throws:
ArgumentNotValid - if the seedlists are null

setPasswords

public void setPasswords(java.util.List<Password> passwords)
Sets the used passwords to the given list. Note: list is copied.

Parameters:
passwords - The passwords to use.
Throws:
ArgumentNotValid - if the passwords are null
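
The note "list is copied" on setSeedLists and setPasswords means that later changes to the caller's list do not affect the configuration. A minimal sketch of that behaviour follows. CopiedListSketch is a hypothetical stand-in that stores plain strings instead of Password objects. IllegalArgumentException stands in for ArgumentNotValid, and count() is an illustrative helper, not part of the documented API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Illustrative stand-in showing the defensive copy; not the real DomainConfiguration. */
final class CopiedListSketch {
    private List<String> passwords = new ArrayList<>();

    /** Stores a copy of the given list, so the caller keeps ownership of its own list. */
    void setPasswords(List<String> passwords) {
        if (passwords == null) {
            // The real class is documented to throw ArgumentNotValid here.
            throw new IllegalArgumentException("passwords must not be null");
        }
        this.passwords = new ArrayList<>(passwords);
    }

    /** Mirrors the documented accessor shape: an iterator rather than the list itself. */
    Iterator<String> getPasswords() {
        return passwords.iterator();
    }

    /** Hypothetical helper used only to observe the stored size. */
    int count() {
        return passwords.size();
    }
}
```

After calling setPasswords with a one-element list and then adding a second element to the caller's list, count() in this sketch still reports one entry.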

getID

long getID()
Get the ID of this configuration. Only for use by DBDAO.

Returns:
the ID of this configuration

setID

void setID(long id)
Set the ID of this configuration. Only for use by DBDAO.

Parameters:
id - use this id for this configuration

hasID

boolean hasID()
Check if this configuration has an ID set yet (doesn't happen until the DBDAO persists it).

Returns:
true, if the configuration has an ID
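
The three ID methods describe an identifier that stays unset until the object is persisted. A minimal sketch of that pattern, assuming a nullable field, is shown below. IdSketch is a hypothetical stand-in, and throwing IllegalStateException from getID when no ID is set is an assumption; the page does not say what the real method does in that case.

```java
/** Illustrative "ID assigned on persist" pattern; not the real DomainConfiguration. */
final class IdSketch {
    /** Null until a persistence layer assigns an ID (assumed representation). */
    private Long id;

    /** True once setID has been called. */
    boolean hasID() {
        return id != null;
    }

    /** Intended to be called by the persistence layer only. */
    void setID(long id) {
        this.id = id;
    }

    /** Assumption: reading an unset ID is treated as a programming error. */
    long getID() {
        if (id == null) {
            throw new IllegalStateException("ID has not been set yet");
        }
        return id;
    }
}
```

Keeping these methods package private, as the signatures above do, limits their use to classes in the same package such as the DAO.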

toString

public java.lang.String toString()
ToString of DomainConfiguration class.

Overrides:
toString in class java.lang.Object
Returns:
a string with info about the instance of this class.