java.lang.Object
  dk.netarkivet.harvester.datamodel.Domain
public class Domain
Represents known information about a domain. A domain is identified by a domain name (e.g. kb.dk).
The following information is used to control how a domain is harvested: seedlists, configurations and passwords. Each seedlist defines one or more URLs that the harvester should use as starting points. A configuration defines a specific combination of settings (seedlist, harvester settings, passwords) that should be used during a harvest. Passwords define user names and passwords that might be used for the domain.
Information about previous harvests of this domain is available via the domainHistory.
Information from the domain registrant (DK-HOSTMASTER) about the domain registration is available in the registration. This includes the dates when the domain was known to exist (included in a domain list), together with domain owner information.
Notice that each configuration references one of the seedlists by name, and possibly one of the passwords.
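The relationship between seedlists and configurations, together with the add and remove rules documented for this class, can be illustrated with a small stand-in model. This is only a sketch under stated assumptions: `DomainModelSketch` is a hypothetical class, not the NetarchiveSuite implementation, and it uses standard Java exceptions in place of `ArgumentNotValid` (`IllegalArgumentException`), `UnknownID` (`NoSuchElementException`) and `PermissionDenied` (`IllegalStateException`).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for Domain; not the NetarchiveSuite class. */
final class DomainModelSketch {
    // seedlist name -> seed URLs
    private final Map<String, String[]> seedLists = new LinkedHashMap<>();
    // configuration name -> name of the seedlist it references
    private final Map<String, String> configurations = new LinkedHashMap<>();

    void addSeedList(String name, String... urls) {
        if (name == null || urls == null) {
            throw new IllegalArgumentException("null argument"); // stands in for ArgumentNotValid
        }
        if (seedLists.containsKey(name)) {
            throw new IllegalStateException("seedlist exists: " + name); // stands in for PermissionDenied
        }
        seedLists.put(name, urls.clone());
    }

    void addConfiguration(String cfgName, String seedListName) {
        if (cfgName == null || seedListName == null) {
            throw new IllegalArgumentException("null argument");
        }
        // A configuration references a seedlist by name, so the name must be known.
        if (!seedLists.containsKey(seedListName)) {
            throw new NoSuchElementException("unknown seedlist: " + seedListName); // stands in for UnknownID
        }
        if (configurations.containsKey(cfgName)) {
            throw new IllegalStateException("configuration exists: " + cfgName);
        }
        configurations.put(cfgName, seedListName);
    }

    void removeSeedList(String name) {
        if (name == null) {
            throw new IllegalArgumentException("null argument");
        }
        if (!seedLists.containsKey(name)) {
            throw new NoSuchElementException("unknown seedlist: " + name);
        }
        // Refuse removal if a configuration uses the seedlist or it is the last one.
        if (configurations.containsValue(name) || seedLists.size() == 1) {
            throw new IllegalStateException("seedlist in use or last seedlist: " + name);
        }
        seedLists.remove(name);
    }

    boolean hasSeedList(String name) {
        return seedLists.containsKey(name);
    }
}
```

Keeping the reference as a name rather than an object pointer is what makes the checks in `addConfiguration` and `removeSeedList` necessary: the model has to verify on every change that no configuration is left pointing at a missing seedlist.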
Field Summary | |
---|---|
(package private) long |
edition
Edition is used by the DAO to keep track of changes. |
protected org.apache.commons.logging.Log |
log
|
Constructor Summary | |
---|---|
protected |
Domain(java.lang.String theDomainName)
Create new instance of a domain. |
Method Summary | |
---|---|
void |
addConfiguration(DomainConfiguration cfg)
Adds a new configuration to the domain. |
void |
addOwnerInfo(DomainOwnerInfo owner)
Add owner information. |
void |
addPassword(Password password)
Adds a password to the domain. |
void |
addSeedList(SeedList seedlist)
Adds a seed list to the domain. |
static java.lang.String |
domainNameFromHostname(java.lang.String hostname)
Return a domain name (last two parts of hostname or IP address). |
AliasInfo |
getAliasInfo()
Returns the alias info for this domain, or null if this domain is not an alias. |
java.util.Iterator<DomainConfiguration> |
getAllConfigurations()
Gets all configurations belonging to this domain. |
java.util.List<DomainConfiguration> |
getAllConfigurationsAsSortedList()
Gets all configurations belonging to this domain. |
DomainOwnerInfo[] |
getAllDomainOwnerInfo()
Get array of domain owner information. |
java.util.Iterator<Password> |
getAllPasswords()
Return the passwords defined for this domain. |
java.util.List<Password> |
getAllPasswordsAsSortedList()
Return the passwords defined for this domain. |
java.util.Iterator<SeedList> |
getAllSeedLists()
Get all seedlists belonging to this domain. |
java.util.List<SeedList> |
getAllSeedListsAsSortedList()
Get all seedlists belonging to this domain. |
java.lang.String |
getComments()
Get the comment of this object. |
DomainConfiguration |
getConfiguration(java.lang.String cfgName)
Returns an already registered configuration. |
java.util.List<java.lang.String> |
getCrawlerTraps()
Returns the list of regexps never to be harvested from this domain, or the empty list if none. |
DomainConfiguration |
getDefaultConfiguration()
Gets the default configuration. |
static Domain |
getDefaultDomain(java.lang.String domainName)
Get a new domain, initialised with default values. |
long |
getEdition()
Get the edition number. |
DomainHistory |
getHistory()
Get the domain history. |
(package private) long |
getID()
Get the ID of this domain. |
java.lang.String |
getName()
Gets the name of this domain. |
Password |
getPassword(java.lang.String name)
Get password information. |
SeedList |
getSeedList(java.lang.String name)
Get a specific seedlist previously added to this domain. |
boolean |
hasConfiguration(java.lang.String configName)
Returns true if this domain has the named configuration. |
(package private) boolean |
hasID()
Check if this domain has an ID set yet (doesn't happen until the DBDAO persists it). |
boolean |
hasPassword(java.lang.String passwordName)
Returns true if this domain has the named password. |
boolean |
hasSeedList(java.lang.String name)
Return true if the named seedlist exists in this domain. |
static boolean |
isValidDomainName(java.lang.String domainName)
|
void |
removeConfiguration(java.lang.String name)
Removes a configuration from this domain. |
void |
removePassword(java.lang.String name)
Removes a password from this Domain. |
void |
removeSeedList(java.lang.String name)
Removes a seedlist from this Domain. |
(package private) void |
setAliasInfo(AliasInfo aliasInfo)
Set the alias field on this object. |
void |
setComments(java.lang.String comments)
Set the comments for this domain. |
void |
setCrawlerTraps(java.util.List<java.lang.String> regExps)
Sets a list of regular expressions defining urls that should never be harvested from this domain. |
void |
setDefaultConfiguration(java.lang.String cfgName)
Mark a configuration as the default configuration to use. |
void |
setEdition(long theEdition)
Set the edition number. |
(package private) void |
setID(long id)
Set the ID of this domain. |
java.lang.String |
toString()
Return a human-readable representation of this object. |
void |
updateAlias(java.lang.String alias)
Update which domain this domain is considered an alias of. |
void |
updateConfiguration(DomainConfiguration cfg)
Replaces existing configuration with cfg, using cfg.getName() as the id for the configuration. |
void |
updatePassword(Password password)
Updates a password on the domain. |
void |
updateSeedList(SeedList seedlist)
Updates a seed list on the domain. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
protected final org.apache.commons.logging.Log log
long edition
Constructor Detail |
---|
protected Domain(java.lang.String theDomainName)
theDomainName
- Name used to reference the domain
ArgumentNotValid
- if the argument is null or empty,
or if the domain does not match the regex for valid domains
Method Detail |
---|
public static boolean isValidDomainName(java.lang.String domainName)
public static java.lang.String domainNameFromHostname(java.lang.String hostname)
hostname
- A hostname or IP address.
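The "last two parts of hostname or IP address" rule from the method summary can be sketched as follows. This is an illustration only, assuming that an IPv4 literal is returned unchanged and that any other input is reduced to its last two dot-separated labels; `DomainNameSketch` is a hypothetical class and the real method may apply further rules.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch; not the NetarchiveSuite implementation. */
final class DomainNameSketch {
    // Loose IPv4 shape check: four dot-separated groups of 1-3 digits (assumption).
    private static final Pattern IPV4 = Pattern.compile("\\d{1,3}(\\.\\d{1,3}){3}");

    static String domainNameFromHostname(String hostname) {
        if (hostname == null || hostname.isEmpty()) {
            throw new IllegalArgumentException("hostname must not be null or empty");
        }
        // Assumption: an IP address is its own "domain name".
        if (IPV4.matcher(hostname).matches()) {
            return hostname;
        }
        String[] parts = hostname.split("\\.");
        if (parts.length <= 2) {
            return hostname; // already at most two labels
        }
        // Keep only the last two labels, e.g. "www.kb.dk" becomes "kb.dk".
        return parts[parts.length - 2] + "." + parts[parts.length - 1];
    }
}
```

A plain two-label rule like this one gives the wrong answer for hostnames under multi-label public suffixes such as `co.uk`, so it should be read as a reading aid for the description above rather than as a general-purpose domain extractor.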
public static Domain getDefaultDomain(java.lang.String domainName)
domainName
- The name of the domain
ArgumentNotValid
- if name is null or empty
public void addConfiguration(DomainConfiguration cfg)
cfg
- the configuration that is added
UnknownID
- if the name of the seedlist referenced by cfg is unknown
PermissionDenied
- if a configuration with the same name already exists
ArgumentNotValid
- if null is supplied
public void addSeedList(SeedList seedlist)
seedlist
- the actual seedlist.
ArgumentNotValid
- if an argument is null
PermissionDenied
- if the seedName already exists
public void updateSeedList(SeedList seedlist)
seedlist
- the actual seedlist.
ArgumentNotValid
- if an argument is null
UnknownID
- if no seedlist named seedlist.getName() exists
public void addPassword(Password password)
password
- A password object to add.
ArgumentNotValid
- if the argument is null
PermissionDenied
- if a password already exists with this name
public void updatePassword(Password password)
password
- A password object to update.
ArgumentNotValid
- if the argument is null
PermissionDenied
- if no password exists with this name
public void setDefaultConfiguration(java.lang.String cfgName)
cfgName
-
UnknownID
- when the cfgName does not match an added configuration
ArgumentNotValid
- if cfgName is null or empty
public DomainConfiguration getConfiguration(java.lang.String cfgName)
cfgName
- the name of a registered configuration
UnknownID
- if the name is not a registered configuration
ArgumentNotValid
- if cfgName is null or empty
public DomainConfiguration getDefaultConfiguration()
UnknownID
- if no configurations exist
public java.lang.String getName()
getName
in interface Named
public java.lang.String getComments()
Named
getComments
in interface Named
public DomainHistory getHistory()
public SeedList getSeedList(java.lang.String name)
name
- the name of the seedlist to return
ArgumentNotValid
- if name is null or empty
UnknownID
- if no seedlist has been added with the supplied name
public boolean hasSeedList(java.lang.String name)
name
- String representing a possible seedlist for the domain.
public void removeSeedList(java.lang.String name)
name
- the name of the seedlist to remove
PermissionDenied
- if the seedlist is in use by a configuration or
this is the last seedlist in this Domain
UnknownID
- if no seedlist exists with the name
ArgumentNotValid
- if a null argument is supplied
public void removePassword(java.lang.String name)
name
- the name of the password to remove
PermissionDenied
- if the password is in use by a configuration or
this is the last password in this Domain
UnknownID
- if no password exists with the name
ArgumentNotValid
- if a null argument is supplied
public void removeConfiguration(java.lang.String name)
name
-
ArgumentNotValid
- if name is null or empty
PermissionDenied
- if an attempt is made to remove the default configuration,
or if one or more HarvestDefinitions reference the configuration
public java.util.Iterator<DomainConfiguration> getAllConfigurations()
public java.util.Iterator<SeedList> getAllSeedLists()
public java.util.Iterator<Password> getAllPasswords()
public java.util.List<DomainConfiguration> getAllConfigurationsAsSortedList()
public java.util.List<SeedList> getAllSeedListsAsSortedList()
public java.util.List<Password> getAllPasswordsAsSortedList()
public void addOwnerInfo(DomainOwnerInfo owner)
owner
- owner
public DomainOwnerInfo[] getAllDomainOwnerInfo()
public Password getPassword(java.lang.String name)
name
- the id of the password settings to retrieve
UnknownID
- if no password info exists with the id "name"
public void setComments(java.lang.String comments)
comments
-
public void updateConfiguration(DomainConfiguration cfg)
cfg
- the configuration to update
UnknownID
- if no configuration exists with the id cfg.getName()
ArgumentNotValid
- if cfg is null
public boolean hasPassword(java.lang.String passwordName)
passwordName
- the identifier of the password info
public boolean hasConfiguration(java.lang.String configName)
configName
- the identifier of the configuration
public long getEdition()
public void setEdition(long theEdition)
theEdition
-
long getID()
void setID(long id)
id
- The new ID for this domain.
boolean hasID()
public java.lang.String toString()
toString
in class java.lang.Object
public void setCrawlerTraps(java.util.List<java.lang.String> regExps)
regExps
- The list defining urls never to be harvested.
ArgumentNotValid
- if regExps is null
public java.util.List<java.lang.String> getCrawlerTraps()
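How a list of crawler-trap regular expressions might be applied to candidate URLs can be sketched with `java.util.regex`. This is a minimal illustration, not the harvester's actual filtering code: `CrawlerTrapSketch` and `withoutTraps` are hypothetical names, and the sketch assumes a trap applies when the expression matches the whole URL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch of applying crawler-trap regexps to URLs. */
final class CrawlerTrapSketch {
    /** Returns the URLs that match none of the given trap expressions. */
    static List<String> withoutTraps(List<String> urls, List<String> regExps) {
        // Compile each expression once instead of once per URL.
        List<Pattern> traps = new ArrayList<>();
        for (String regExp : regExps) {
            traps.add(Pattern.compile(regExp));
        }
        List<String> kept = new ArrayList<>();
        for (String url : urls) {
            boolean trapped = false;
            for (Pattern trap : traps) {
                // Assumption: the expression must match the entire URL.
                if (trap.matcher(url).matches()) {
                    trapped = true;
                    break;
                }
            }
            if (!trapped) {
                kept.add(url);
            }
        }
        return kept;
    }
}
```

With whole-URL matching, a trap such as `.*/calendar/.*` needs the leading and trailing `.*`; if partial matching were intended instead, `Matcher.find()` would replace `matches()`.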
public AliasInfo getAliasInfo()
public void updateAlias(java.lang.String alias)
alias
- The name (e.g. "netarkivet.dk") of the domain that this
domain is an alias of.
UnknownID
- If the given domain does not exist
IllegalState
- If updating the alias info would violate constraints
of alias: no transitivity, no reflection.
void setAliasInfo(AliasInfo aliasInfo)
aliasInfo
- Alias information
ArgumentNotValid
- if the alias info is not for this domain
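The alias constraints named for updateAlias (no transitivity, no reflection) can be sketched as a check over a map from each domain to the domain it is an alias of. This is one possible reading of those constraints, not the actual implementation: `AliasSketch` is a hypothetical class, and `NoSuchElementException` and `IllegalStateException` stand in for `UnknownID` and `IllegalState`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical sketch of the alias constraints; not the NetarchiveSuite class. */
final class AliasSketch {
    // domain name -> name of the domain it is an alias of (null if not an alias)
    private final Map<String, String> aliasOf = new HashMap<>();

    void addDomain(String name) {
        aliasOf.put(name, null);
    }

    void updateAlias(String domain, String alias) {
        if (!aliasOf.containsKey(domain) || !aliasOf.containsKey(alias)) {
            throw new NoSuchElementException("unknown domain"); // stands in for UnknownID
        }
        // No reflection: a domain cannot be an alias of itself.
        if (domain.equals(alias)) {
            throw new IllegalStateException("a domain cannot be an alias of itself");
        }
        // No transitivity: the target must not itself be an alias ...
        if (aliasOf.get(alias) != null) {
            throw new IllegalStateException(alias + " is itself an alias");
        }
        // ... and no other domain may already be an alias of this domain.
        if (aliasOf.containsValue(domain)) {
            throw new IllegalStateException("other domains are aliases of " + domain);
        }
        aliasOf.put(domain, alias);
    }

    String getAliasOf(String domain) {
        return aliasOf.get(domain);
    }
}
```

Under this reading, alias chains can never form, so looking up the domain that an alias stands for always takes a single step.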