java.lang.Object dk.netarkivet.harvester.datamodel.DomainConfiguration
public class DomainConfiguration
This class describes a configuration for harvesting a domain. It combines a number of seedlists, a number of passwords, an order template, and some specialised settings to define the way to harvest a domain.
Constructor Summary | |
---|---|
DomainConfiguration(java.lang.String theConfigName,
Domain domain,
java.util.List<SeedList> seedlists,
java.util.List<Password> passwords)
Create a new configuration for a domain. |
|
DomainConfiguration(java.lang.String theConfigName,
java.lang.String domainName,
DomainHistory history,
java.util.List<java.lang.String> crawlertraps,
java.util.List<SeedList> seedlists,
java.util.List<Password> passwords)
Alternate constructor. |
Method Summary | |
---|---|
void |
addPassword(Domain domain,
Password password)
Add password to the configuration. |
void |
addSeedList(Domain domain,
SeedList seedlist)
Add a new seedlist to the configuration. |
java.lang.String |
getComments()
Returns comments. |
java.util.List<java.lang.String> |
getCrawlertraps()
|
DomainHistory |
getDomainhistory()
|
java.lang.String |
getDomainName()
Returns the name of the domain aggregating this configuration. |
long |
getExpectedNumberOfObjects(long objectLimit,
long byteLimit)
Gets the best expectation for how many objects a harvest using this configuration will retrieve, given a job with a maximum limit pr. |
(package private) long |
getID()
Get the ID of this configuration. |
long |
getMaxBytes()
Returns the maximum number of bytes to download during a single harvest of a domain. |
long |
getMaxObjects()
Returns the maximum number of objects to harvest from the domain. |
int |
getMaxRequestRate()
Returns the maximum request rate to use when harvesting the domain. |
java.lang.String |
getName()
Get the configuration name. |
java.lang.String |
getOrderXmlName()
Returns the name of the order xml file used by the domain. |
java.util.Iterator<Password> |
getPasswords()
Get an iterator of passwords used in this configuration. |
java.util.Iterator<SeedList> |
getSeedLists()
Get an iterator of seedlists used in this configuration. |
(package private) boolean |
hasID()
Check if this configuration has an ID set yet (doesn't happen until the DBDAO persists it). |
long |
minObjectsBytesLimit(long objectLimit,
long byteLimit,
long expectedObjectSize)
Return the lowest limit for the two values, or MAX_DOMAIN_SIZE if both are infinite, which is the max size we harvest from this domain. |
void |
removePassword(java.lang.String passwordName)
Remove a password from the list of passwords used in this domain. |
void |
setComments(java.lang.String comments)
Set the comments field. |
void |
setCrawlertraps(java.util.List<java.lang.String> someCrawlertraps)
Set the crawlertraps for this configuration. |
void |
setDomainhistory(DomainHistory newDomainhistory)
Set the domainHistory for this configuration. |
(package private) void |
setID(long anId)
Set the ID of this configuration. |
void |
setMaxBytes(long maxBytes)
Specify the maximum number of bytes to download from a domain in a single harvest. |
void |
setMaxObjects(long max)
Specify the maximum number of objects to retrieve from the domain. |
void |
setMaxRequestRate(int maxrate)
Specify the maximum request rate to use when harvesting data. |
void |
setOrderXmlName(java.lang.String ordername)
Specify the name of the order.xml template to use. |
void |
setPasswords(Domain domain,
java.util.List<Password> newPasswords)
Sets the used passwords to the given list. |
void |
setSeedLists(Domain domain,
java.util.List<SeedList> newSeedlists)
Sets the used seedlists to the given list. |
java.lang.String |
toString()
ToString of DomainConfiguration class. |
boolean |
usesPassword(java.lang.String passwordName)
Check whether this domain uses a given password. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public DomainConfiguration(java.lang.String theConfigName, Domain domain, java.util.List<SeedList> seedlists, java.util.List<Password> passwords)
theConfigName
- The name of this configuration
domain
- The domain that this configuration is for
seedlists
- Seedlists to use in this configuration.
passwords
- Passwords to use in this configuration.
public DomainConfiguration(java.lang.String theConfigName, java.lang.String domainName, DomainHistory history, java.util.List<java.lang.String> crawlertraps, java.util.List<SeedList> seedlists, java.util.List<Password> passwords)
theConfigName
- The name of this configuration
domainName
- The name of the domain that this configuration is for
history
- The domainhistory belonging to the given domain
crawlertraps
- The crawlertraps to use for the given domain
seedlists
- Seedlists to use in this configuration.
passwords
- Passwords to use in this configuration.
Method Detail |
---|
public void setOrderXmlName(java.lang.String ordername)
ordername
- order.xml template name
ArgumentNotValid
- if the template name is null or empty
public void setMaxObjects(long max)
max
- maximum number of objects to retrieve
ArgumentNotValid
- if max < -1
public void setMaxRequestRate(int maxrate)
maxrate
- the maximum request rate
ArgumentNotValid
- if maxrate < 0
public void setMaxBytes(long maxBytes)
maxBytes
- Maximum number of bytes to download, or -1 for no limit.
ArgumentNotValid
- if maxBytes < -1
public java.lang.String getName()
getName
in interface Named
public java.lang.String getComments()
getComments
in interface Named
public java.lang.String getOrderXmlName()
public long getMaxObjects()
public int getMaxRequestRate()
public long getMaxBytes()
public java.lang.String getDomainName()
public java.util.Iterator<SeedList> getSeedLists()
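The argument checks documented for the setters above can be illustrated with a small stand-alone sketch. It is not the NetarchiveSuite implementation: `LimitSettingsSketch` is a hypothetical class, `IllegalArgumentException` stands in for `ArgumentNotValid` so the example compiles without the library, and the default field values are assumptions.

```java
// Sketch only: a hypothetical stand-in for the limit fields of a configuration.
// IllegalArgumentException replaces ArgumentNotValid so this compiles on its own.
final class LimitSettingsSketch {
    // Default values are assumptions made for this example.
    private String orderXmlName = "";
    private long maxObjects = -1;
    private int maxRequestRate = 0;
    private long maxBytes = -1;

    void setOrderXmlName(String ordername) {
        // Documented rule: null or empty template names are rejected.
        if (ordername == null || ordername.isEmpty()) {
            throw new IllegalArgumentException("ordername must not be null or empty");
        }
        this.orderXmlName = ordername;
    }

    void setMaxObjects(long max) {
        // Documented rule: values below -1 are rejected.
        if (max < -1) {
            throw new IllegalArgumentException("max must be >= -1, was " + max);
        }
        this.maxObjects = max;
    }

    void setMaxRequestRate(int maxrate) {
        // Documented rule: negative rates are rejected.
        if (maxrate < 0) {
            throw new IllegalArgumentException("maxrate must be >= 0, was " + maxrate);
        }
        this.maxRequestRate = maxrate;
    }

    void setMaxBytes(long maxBytes) {
        // Documented rule: -1 means no limit; anything below -1 is rejected.
        if (maxBytes < -1) {
            throw new IllegalArgumentException("maxBytes must be >= -1, was " + maxBytes);
        }
        this.maxBytes = maxBytes;
    }

    String getOrderXmlName() { return orderXmlName; }
    long getMaxObjects() { return maxObjects; }
    int getMaxRequestRate() { return maxRequestRate; }
    long getMaxBytes() { return maxBytes; }

    public static void main(String[] args) {
        LimitSettingsSketch config = new LimitSettingsSketch();
        config.setOrderXmlName("default_orderxml");
        config.setMaxObjects(2000);
        config.setMaxBytes(-1); // accepted: -1 means no byte limit
        try {
            config.setMaxBytes(-2); // rejected: below the -1 sentinel
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Validating in the setter keeps an invalid limit from ever being stored, so the getters can return the fields without further checks.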
public void addSeedList(Domain domain, SeedList seedlist)
seedlist
- the seedlist to add
domain
- The domain in which the seedlist must exist
ArgumentNotValid
- if the seedlist is null
UnknownID
- if the seedlist is not defined on the domain
PermissionDenied
- if the seedlist is different from the one on the domain.
public java.util.Iterator<Password> getPasswords()
public void addPassword(Domain domain, Password password)
password
- The password to add (must exist in the domain)
domain
- the domain where the password should come from.
public long getExpectedNumberOfObjects(long objectLimit, long byteLimit)
objectLimit
- The maximum limit, or Constants.HERITRIX_MAXOBJECTS_INFINITY for no limit. This limit overrides the limit set on the configuration, unless override is in effect.
byteLimit
- The maximum number of bytes that will be used as limit in the harvest. This limit overrides the limit set on the configuration, unless override is in effect.
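The three failure cases documented for addSeedList above (null seedlist, seedlist unknown to the domain, seedlist differing from the domain's copy) might be sketched as follows. Everything here is a hypothetical stand-in: the domain is modelled as a plain map from seedlist name to seed text, and standard JDK exceptions replace `ArgumentNotValid`, `UnknownID` and `PermissionDenied`. How the library actually compares seedlists is not stated in this page, so the equality check is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Sketch only: hypothetical stand-ins for Domain, SeedList and the library exceptions.
final class SeedListCheckSketch {
    // Stand-in for the domain: seedlist name -> seed text.
    private final Map<String, String> domainSeedLists = new HashMap<>();
    // Names of the seedlists used by this (hypothetical) configuration.
    private final List<String> configSeedLists = new ArrayList<>();

    void defineOnDomain(String name, String seeds) {
        domainSeedLists.put(name, seeds);
    }

    void addSeedList(String name, String seeds) {
        if (name == null || seeds == null) {
            // Stands in for ArgumentNotValid.
            throw new IllegalArgumentException("seedlist must not be null");
        }
        String onDomain = domainSeedLists.get(name);
        if (onDomain == null) {
            // Stands in for UnknownID.
            throw new NoSuchElementException("seedlist '" + name + "' is not defined on the domain");
        }
        if (!onDomain.equals(seeds)) {
            // Stands in for PermissionDenied. Comparing the seed text is an assumption.
            throw new IllegalStateException("seedlist '" + name + "' differs from the one on the domain");
        }
        configSeedLists.add(name);
    }

    boolean usesSeedList(String name) {
        return configSeedLists.contains(name);
    }

    public static void main(String[] args) {
        SeedListCheckSketch sketch = new SeedListCheckSketch();
        sketch.defineOnDomain("defaultseeds", "http://www.example.org/");
        sketch.addSeedList("defaultseeds", "http://www.example.org/");
        System.out.println("uses defaultseeds: " + sketch.usesSeedList("defaultseeds"));
        try {
            sketch.addSeedList("otherseeds", "http://other.example.org/");
        } catch (NoSuchElementException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```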
public long minObjectsBytesLimit(long objectLimit, long byteLimit, long expectedObjectSize)
objectLimit
- A long value defining an object limit, or 0 for infinite
byteLimit
- A long value defining a byte limit, or HarvesterSettings.MAX_DOMAIN_SIZE for infinite.
expectedObjectSize
- The expected number of bytes per object
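One plausible reading of "the lowest limit for the two values" is to convert the byte limit into an object count using the expected object size and then take the smaller of the two counts. The sketch below follows that reading only. `NO_LIMIT` and `FALLBACK_MAX_OBJECTS` are hypothetical constants chosen for the example; the sentinel values the library uses for "infinite" and the value of the MAX_DOMAIN_SIZE setting are not reproduced here, and the division step is an assumption rather than confirmed behaviour.

```java
// Sketch only: an illustrative reading of the documented behaviour, not the library code.
final class LimitMathSketch {
    // Hypothetical sentinel meaning "no limit" in this sketch.
    static final long NO_LIMIT = -1L;
    // Hypothetical stand-in for the value configured as MAX_DOMAIN_SIZE.
    static final long FALLBACK_MAX_OBJECTS = 1_000_000L;

    static long minObjectsBytesLimit(long objectLimit, long byteLimit, long expectedObjectSize) {
        if (expectedObjectSize <= 0) {
            throw new IllegalArgumentException("expectedObjectSize must be positive");
        }
        boolean objectsInfinite = objectLimit == NO_LIMIT;
        boolean bytesInfinite = byteLimit == NO_LIMIT;
        if (objectsInfinite && bytesInfinite) {
            // Both limits are infinite: fall back to the configured maximum.
            return FALLBACK_MAX_OBJECTS;
        }
        if (bytesInfinite) {
            return objectLimit;
        }
        // Assumption: a byte limit translates to objects via the expected object size.
        long objectsByBytes = byteLimit / expectedObjectSize;
        if (objectsInfinite) {
            return objectsByBytes;
        }
        return Math.min(objectLimit, objectsByBytes);
    }

    public static void main(String[] args) {
        // 1000 bytes at 50 bytes per object allows 20 objects, which is below 100.
        System.out.println(minObjectsBytesLimit(100, 1000, 50));
        System.out.println(minObjectsBytesLimit(NO_LIMIT, NO_LIMIT, 50));
    }
}
```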
public void setComments(java.lang.String comments)
comments
- User-entered free-form comments.
public void removePassword(java.lang.String passwordName)
passwordName
- Name of the password to remove.
public boolean usesPassword(java.lang.String passwordName)
passwordName
- Name of the password to check
public void setSeedLists(Domain domain, java.util.List<SeedList> newSeedlists)
newSeedlists
- The seedlists to use.
domain
- The domain where the seedlists should come from
ArgumentNotValid
- if the seedlists are null
public void setPasswords(Domain domain, java.util.List<Password> newPasswords)
newPasswords
- The passwords to use.
domain
- The domain where the passwords should come from
ArgumentNotValid
- if the passwords are null
long getID()
void setID(long anId)
anId
- use this id for this configuration
boolean hasID()
public java.lang.String toString()
toString
in class java.lang.Object
public void setCrawlertraps(java.util.List<java.lang.String> someCrawlertraps)
someCrawlertraps
- a list of crawlertraps
public java.util.List<java.lang.String> getCrawlertraps()
public DomainHistory getDomainhistory()
public void setDomainhistory(DomainHistory newDomainhistory)
newDomainhistory
- the new domainHistory for this configuration (null is accepted for no history)