public class DomainConfiguration extends Object implements Named
Constructor and Description |
---|
DomainConfiguration(String theConfigName,
Domain domain,
List<SeedList> seedlists,
List<Password> passwords)
Create a new configuration for a domain.
|
DomainConfiguration(String theConfigName,
String domainName,
DomainHistory history,
List<String> crawlertraps,
List<SeedList> seedlists,
List<Password> passwords)
Alternate constructor that takes the domain name, history, and crawlertraps directly instead of a Domain object.
|
Modifier and Type | Method and Description |
---|---|
void |
addPassword(Domain domain,
Password password)
Add password to the configuration.
|
void |
addSeedList(Domain domain,
SeedList seedlist)
Add a new seedlist to the configuration.
|
static String |
cfgToString(DomainConfiguration cfg) |
List<EAV.AttributeAndType> |
getAttributesAndTypes()
Get this configuration's EAV attributes and attribute types.
|
String |
getComments()
Returns comments.
|
List<String> |
getCrawlertraps() |
DomainHistory |
getDomainhistory() |
String |
getDomainName()
Returns the name of the domain aggregating this configuration.
|
long |
getExpectedNumberOfObjects(long objectLimit,
long byteLimit)
Gets the best expectation for how many objects a harvest using this configuration will retrieve, given a job with maximum object and byte limits.
|
Long |
getID()
Get the ID of this configuration.
|
long |
getMaxBytes()
Returns the maximum number of bytes to download during a single harvest of a domain.
|
long |
getMaxObjects()
Returns the maximum number of objects to harvest from the domain.
|
int |
getMaxRequestRate()
Returns the maximum request rate to use when harvesting the domain.
|
String |
getName()
Get the configuration name.
|
String |
getOrderXmlName()
Returns the name of the order xml file used by the domain.
|
Iterator<Password> |
getPasswords()
Get an iterator of passwords used in this configuration.
|
Iterator<SeedList> |
getSeedLists()
Get an iterator of seedlists used in this configuration.
|
long |
minObjectsBytesLimit(long objectLimit,
long byteLimit,
long expectedObjectSize)
Returns the lower of the two limits, or MAX_DOMAIN_SIZE (the maximum size we harvest from this domain) if both are infinite.
|
void |
removePassword(String passwordName)
Remove a password from the list of passwords used in this domain.
|
void |
setAttributesAndTypes(List<EAV.AttributeAndType> attributesAndTypes)
Set this configuration's EAV attributes and attribute types.
|
void |
setComments(String comments)
Set the comments field.
|
void |
setCrawlertraps(List<String> someCrawlertraps)
Set the crawlertraps for this configuration.
|
void |
setDomainhistory(DomainHistory newDomainhistory)
Set the domainHistory for this configuration.
|
void |
setMaxBytes(long maxBytes)
Specify the maximum number of bytes to download from a domain in a single harvest.
|
void |
setMaxObjects(long max)
Specify the maximum number of objects to retrieve from the domain.
|
void |
setMaxRequestRate(int maxrate)
Specify the maximum request rate to use when harvesting data.
|
void |
setName(String configName)
Change the name of this configuration to the given configName.
|
void |
setOrderXmlName(String ordername)
Specify the name of the order.xml template to use.
|
void |
setPasswords(Domain domain,
List<Password> newPasswords)
Sets the used passwords to the given list.
|
void |
setSeedLists(Domain domain,
List<SeedList> newSeedlists)
Sets the used seedlists to the given list.
|
String |
toString()
Returns a string representation of this DomainConfiguration.
|
boolean |
usesPassword(String passwordName)
Check whether this domain uses a given password.
|
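
The summary of `minObjectsBytesLimit` above says it returns the lower of an object limit and a byte limit, falling back to `MAX_DOMAIN_SIZE` when both are infinite. The sketch below illustrates one way such a rule could work; it is not taken from the NetarchiveSuite source. The class `LimitSketch`, the single `INFINITE` sentinel of `-1`, the explicit `maxDomainSize` parameter, and the conversion of the byte limit into an object count by dividing by `expectedObjectSize` are all assumptions made for illustration. The parameter documentation for the real method names its own sentinel values for each argument, which this sketch does not model.

```java
/** Hypothetical illustration of a "lowest of two limits" rule; not the library code. */
final class LimitSketch {
    /** Assumed sentinel meaning "no limit" for both arguments. */
    static final long INFINITE = -1L;

    private LimitSketch() {
    }

    /**
     * Returns the lower of the object limit and the number of objects the
     * byte limit allows, or maxDomainSize when both limits are infinite.
     * maxDomainSize stands in for the configured MAX_DOMAIN_SIZE value.
     */
    static long minObjectsBytesLimit(long objectLimit, long byteLimit,
            long expectedObjectSize, long maxDomainSize) {
        if (expectedObjectSize <= 0) {
            throw new IllegalArgumentException("expectedObjectSize must be positive");
        }
        boolean objectsInfinite = objectLimit == INFINITE;
        boolean bytesInfinite = byteLimit == INFINITE;
        if (objectsInfinite && bytesInfinite) {
            return maxDomainSize;
        }
        if (bytesInfinite) {
            return objectLimit;
        }
        // Assumption: a byte limit is compared to an object limit by
        // converting it to "objects that fit" at the expected object size.
        long objectsAllowedByBytes = byteLimit / expectedObjectSize;
        if (objectsInfinite) {
            return objectsAllowedByBytes;
        }
        return Math.min(objectLimit, objectsAllowedByBytes);
    }

    public static void main(String[] args) {
        System.out.println(minObjectsBytesLimit(1000L, 50_000L, 100L, 9_999L));
        System.out.println(minObjectsBytesLimit(INFINITE, INFINITE, 100L, 9_999L));
    }
}
```

Passing the fallback value as a parameter keeps the sketch free of any settings lookup, which makes the rule easy to test in isolation.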
public DomainConfiguration(String theConfigName, Domain domain, List<SeedList> seedlists, List<Password> passwords)
theConfigName
- The name of this configuration.
domain
- The domain that this configuration is for.
seedlists
- Seedlists to use in this configuration.
passwords
- Passwords to use in this configuration.
public DomainConfiguration(String theConfigName, String domainName, DomainHistory history, List<String> crawlertraps, List<SeedList> seedlists, List<Password> passwords)
theConfigName
- The name of this configuration.
domainName
- The name of the domain that this configuration is for.
history
- The domain history of the given domain.
crawlertraps
- The crawlertraps of the given domain.
seedlists
- Seedlists to use in this configuration.
passwords
- Passwords to use in this configuration.
public static String cfgToString(DomainConfiguration cfg)
public void setOrderXmlName(String ordername)
ordername
- order.xml template name.
ArgumentNotValid
- if ordername is null or empty.
public void setMaxObjects(long max)
max
- Maximum number of objects to retrieve.
ArgumentNotValid
- if max < -1.
public void setMaxRequestRate(int maxrate)
maxrate
- The maximum request rate.
ArgumentNotValid
- if maxrate < 0.
public void setMaxBytes(long maxBytes)
maxBytes
- Maximum number of bytes to download, or -1 for no limit.
ArgumentNotValid
- if maxBytes < -1.
public String getComments()
Specified by: getComments in interface Named
public String getOrderXmlName()
public long getMaxObjects()
public int getMaxRequestRate()
public long getMaxBytes()
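
The limit setters above document a common pattern: `-1` means "no limit" for `setMaxBytes`, and values below the documented minimum are rejected with `ArgumentNotValid`. The following is a minimal sketch of that validation, not the library's implementation. `LimitChecks` and its methods are hypothetical names, `IllegalArgumentException` stands in for NetarchiveSuite's `ArgumentNotValid`, and treating `-1` as "no limit" for `setMaxObjects` is an assumption (the documentation states it only for `setMaxBytes`).

```java
/** Hypothetical helper mirroring the documented argument checks of the limit setters. */
final class LimitChecks {
    /** Sentinel meaning "no limit"; documented for setMaxBytes, assumed for setMaxObjects. */
    static final long NO_LIMIT = -1L;

    private LimitChecks() {
    }

    /**
     * Returns value unchanged if it is at least min, otherwise throws.
     * IllegalArgumentException is a stand-in for ArgumentNotValid.
     */
    static long requireAtLeast(long value, long min, String name) {
        if (value < min) {
            throw new IllegalArgumentException(name + " must be >= " + min + ", got " + value);
        }
        return value;
    }

    /** Documented rule: maxBytes < -1 is invalid. */
    static long checkMaxBytes(long maxBytes) {
        return requireAtLeast(maxBytes, NO_LIMIT, "maxBytes");
    }

    /** Documented rule: max < -1 is invalid. */
    static long checkMaxObjects(long max) {
        return requireAtLeast(max, NO_LIMIT, "max");
    }

    /** Documented rule: maxrate < 0 is invalid. */
    static int checkMaxRequestRate(int maxrate) {
        return (int) requireAtLeast(maxrate, 0L, "maxrate");
    }

    public static void main(String[] args) {
        System.out.println(checkMaxBytes(NO_LIMIT));
        System.out.println(checkMaxRequestRate(60));
        try {
            checkMaxObjects(-2L);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```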
public String getDomainName()
public Iterator<SeedList> getSeedLists()
public void addSeedList(Domain domain, SeedList seedlist)
seedlist
- The seedlist to add.
domain
- The domain in which to check that the seedlist exists.
ArgumentNotValid
- if the seedlist is null.
UnknownID
- if the seedlist is not defined on the domain.
PermissionDenied
- if the seedlist is different from the one on the domain.
public void setSeedLists(Domain domain, List<SeedList> newSeedlists)
newSeedlists
- The seedlists to use.
domain
- The domain where the seedlists should come from.
ArgumentNotValid
- if the seedlists are null.
public Iterator<Password> getPasswords()
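
The exceptions listed for `addSeedList` suggest three checks against the owning domain. A minimal sketch of that sequence follows, using stand-in types rather than the real `Domain` and `SeedList` classes. `SeedListCheck`, the nested `Seeds` class, the `Map<String, String>` model of a domain's seedlists (name to seed text), and the standard Java exceptions used in place of `ArgumentNotValid`, `UnknownID`, and `PermissionDenied` are all assumptions; the order of the checks is also a guess.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical model of the checks documented for addSeedList; not the library code. */
final class SeedListCheck {
    /** Stand-in for SeedList: a name plus the seed text. */
    static final class Seeds {
        final String name;
        final String seeds;

        Seeds(String name, String seeds) {
            this.name = name;
            this.seeds = seeds;
        }
    }

    private SeedListCheck() {
    }

    /**
     * Adds seedlist to configSeedLists only if the domain defines an
     * identical seedlist under the same name.
     */
    static void addSeedList(Map<String, String> domainSeedLists,
            List<Seeds> configSeedLists, Seeds seedlist) {
        if (seedlist == null) {
            // Stand-in for ArgumentNotValid.
            throw new IllegalArgumentException("seedlist must not be null");
        }
        String onDomain = domainSeedLists.get(seedlist.name);
        if (onDomain == null) {
            // Stand-in for UnknownID.
            throw new NoSuchElementException("no seedlist '" + seedlist.name + "' on domain");
        }
        if (!onDomain.equals(seedlist.seeds)) {
            // Stand-in for PermissionDenied.
            throw new IllegalStateException("seedlist '" + seedlist.name + "' differs from the domain's");
        }
        configSeedLists.add(seedlist);
    }

    public static void main(String[] args) {
        Map<String, String> domain = new HashMap<>();
        domain.put("default", "http://example.org/");
        List<Seeds> config = new ArrayList<>();
        addSeedList(domain, config, new Seeds("default", "http://example.org/"));
        System.out.println(config.size());
    }
}
```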
public void addPassword(Domain domain, Password password)
password
- The password to add (must exist in the domain).
domain
- The domain where the password should come from.
public long getExpectedNumberOfObjects(long objectLimit, long byteLimit)
objectLimit
- The maximum number of objects, or Constants.HERITRIX_MAXOBJECTS_INFINITY for no limit. This limit overrides the limit set on the configuration, unless override is in effect.
byteLimit
- The maximum number of bytes that will be used as limit in the harvest. This limit overrides the limit set on the configuration, unless override is in effect.
public long minObjectsBytesLimit(long objectLimit, long byteLimit, long expectedObjectSize)
objectLimit
- A long value defining an object limit, or 0 for infinite.
byteLimit
- A long value defining a byte limit, or HarvesterSettings.MAX_DOMAIN_SIZE for infinite.
expectedObjectSize
- The expected number of bytes per object.
public void setComments(String comments)
comments
- User-entered free-form comments.
public void removePassword(String passwordName)
passwordName
- Name of the password to remove.
public boolean usesPassword(String passwordName)
passwordName
- Name of the password to check for.
public void setPasswords(Domain domain, List<Password> newPasswords)
newPasswords
- The passwords to use.
domain
- The domain where the passwords should come from.
ArgumentNotValid
- if the passwords are null.
public void setCrawlertraps(List<String> someCrawlertraps)
someCrawlertraps
- A list of crawlertraps.
public List<String> getCrawlertraps()
public DomainHistory getDomainhistory()
public void setDomainhistory(DomainHistory newDomainhistory)
newDomainhistory
- The new domainHistory for this configuration (null is accepted for no history).
public void setName(String configName)
configName
- A new name for this configuration.
public List<EAV.AttributeAndType> getAttributesAndTypes()
public void setAttributesAndTypes(List<EAV.AttributeAndType> attributesAndTypes)
attributesAndTypes
- EAV attributes and attribute types.
Copyright © 2005–2016 The Royal Danish Library, the Danish State and University Library, the National Library of France and the Austrian National Library. All rights reserved.