public class Domain extends ExtendableEntity implements Named
The following information controls how a domain is harvested: seedlists, configurations and passwords. Each seedlist defines one or more URLs that the harvester should use as starting points. A configuration defines a specific combination of settings (seedlist, harvester settings, passwords) that should be used during a harvest. Passwords define user names and passwords that might be used for the domain.
Information about previous harvests of this domain is available via the domainHistory.
Information from the domain registry (DK-HOSTMASTER) about the domain registration is available in the registration. This includes the dates on which the domain was known to exist (that is, was included in a domain list), together with domain owner information.
Note that each configuration references one of the seedlists by name, and possibly one of the passwords.
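The by-name references described above can be illustrated with a small stand-alone model. This is a minimal sketch, not the NetarchiveSuite implementation: the class `DomainReferenceSketch` is hypothetical, seedlists and configurations are reduced to their names, and standard Java exceptions stand in for `UnknownID` and `PermissionDenied`.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model for illustration only; NOT the NetarchiveSuite code.
// IllegalArgumentException stands in for UnknownID,
// IllegalStateException stands in for PermissionDenied.
class DomainReferenceSketch {
    private final Set<String> seedLists = new HashSet<>();
    // Maps a configuration name to the name of the seedlist it references.
    private final Map<String, String> seedListOfConfig = new HashMap<>();

    void addSeedList(String name) {
        if (!seedLists.add(name)) {
            throw new IllegalStateException("Seedlist already exists: " + name);
        }
    }

    void addConfiguration(String cfgName, String seedListName) {
        // A configuration may only reference a seedlist the domain already has.
        if (!seedLists.contains(seedListName)) {
            throw new IllegalArgumentException("Unknown seedlist: " + seedListName);
        }
        if (seedListOfConfig.containsKey(cfgName)) {
            throw new IllegalStateException("Configuration already exists: " + cfgName);
        }
        seedListOfConfig.put(cfgName, seedListName);
    }

    void removeSeedList(String name) {
        if (!seedLists.contains(name)) {
            throw new IllegalArgumentException("Unknown seedlist: " + name);
        }
        // A seedlist that a configuration still references cannot be removed.
        if (seedListOfConfig.containsValue(name)) {
            throw new IllegalStateException("Seedlist is in use: " + name);
        }
        if (seedLists.size() == 1) {
            throw new IllegalStateException("Cannot remove the last seedlist");
        }
        seedLists.remove(name);
    }

    boolean hasSeedList(String name) {
        return seedLists.contains(name);
    }

    boolean hasConfiguration(String cfgName) {
        return seedListOfConfig.containsKey(cfgName);
    }

    public static void main(String[] args) {
        DomainReferenceSketch domain = new DomainReferenceSketch();
        domain.addSeedList("defaultseeds");
        domain.addSeedList("extraseeds");
        domain.addConfiguration("defaultconfig", "defaultseeds");
        domain.removeSeedList("extraseeds"); // allowed: unused and not the last seedlist
        try {
            domain.removeSeedList("defaultseeds"); // rejected: still referenced
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Because references are held by name, the checks run when an item is added or removed, which mirrors the exceptions documented for addConfiguration and removeSeedList.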
Modifier and Type | Field and Description |
---|---|
protected static org.slf4j.Logger | log: The logger for this class. |

Fields inherited from class ExtendableEntity: extendedFieldValues

Modifier | Constructor and Description |
---|---|
protected | Domain(String theDomainName): Create new instance of a domain. |

Modifier and Type | Method and Description |
---|---|
void | addConfiguration(DomainConfiguration cfg): Adds a new configuration to the domain. |
void | addOwnerInfo(DomainOwnerInfo owner): Adds owner information. |
void | addPassword(Password password): Adds a password to the domain. |
void | addSeedList(SeedList seedlist): Adds a seed list to the domain. |
AliasInfo | getAliasInfo(): Returns the alias info for this domain, or null if this domain is not an alias. |
Iterator<DomainConfiguration> | getAllConfigurations(): Gets all configurations belonging to this domain. |
List<DomainConfiguration> | getAllConfigurationsAsSortedList(Locale loc): Gets all configurations belonging to this domain. |
DomainOwnerInfo[] | getAllDomainOwnerInfo(): Gets an array of domain owner information. |
Iterator<Password> | getAllPasswords(): Returns the passwords defined for this domain. |
List<Password> | getAllPasswordsAsSortedList(Locale loc): Returns the passwords defined for this domain. |
Iterator<SeedList> | getAllSeedLists(): Gets all seedlists belonging to this domain. |
List<SeedList> | getAllSeedListsAsSortedList(Locale loc): Gets all seedlists belonging to this domain. |
HarvestInfo | getBestHarvestInfoExpectation(String configName): Gets the harvest info that gives the best expectation of how many objects a harvest using the given configuration will retrieve, prioritising the most recent harvest for which we have a full harvest. |
String | getComments(): Gets the comment of this object. |
DomainConfiguration | getConfiguration(String cfgName): Returns an already registered configuration. |
List<String> | getCrawlerTraps(): Returns the list of regexps never to be harvested from this domain, or the empty list if none. |
DomainConfiguration | getDefaultConfiguration(): Gets the default configuration. |
static Domain | getDefaultDomain(String domainName): Gets a new domain, initialised with default values. |
long | getEdition(): Gets the edition number. |
protected int | getExtendedFieldType(): All derived classes allow ExtendedFields from Type ExtendedFieldTypes.DOMAIN |
DomainHistory | getHistory(): Gets the domain history. |
long | getID(): Gets the ID of this domain. |
String | getName(): Gets the name of this domain. |
Password | getPassword(String name): Gets password information. |
SeedList | getSeedList(String name): Gets a specific seedlist previously added to this domain. |
boolean | hasConfiguration(String configName): Returns true if this domain has the named configuration. |
boolean | hasPassword(String passwordName): Returns true if this domain has the named password. |
boolean | hasSeedList(String name): Returns true if the named seedlist exists in this domain. |
void | removeConfiguration(String configName): Removes a configuration from this domain. |
void | removePassword(String name): Removes a password from this Domain. |
void | removeSeedList(String name): Removes a seedlist from this Domain. |
void | setComments(String comments): Sets the comments for this domain. |
void | setCrawlerTraps(List<String> regExps, boolean strictMode): Sets a list of regular expressions defining URLs that should never be harvested from this domain. |
void | setDefaultConfiguration(String cfgName): Marks a configuration as the default configuration to use. |
void | setEdition(long theNewEdition): Sets the edition number. |
String | toString(): Returns a human-readable representation of this object. |
void | updateAlias(String alias): Updates which domain this domain is considered an alias of. |
void | updateConfiguration(DomainConfiguration cfg): Replaces the existing configuration with cfg, using cfg.getName() as the id for the configuration. |
void | updatePassword(Password password): Updates a password on the domain. |
void | updateSeedList(SeedList seedlist): Updates a seed list on the domain. |

Methods inherited from class ExtendableEntity: addExtendedFieldValue, addExtendedFieldValues, getExtendedFieldValue, getExtendedFieldValues, setExtendedFieldValues, updateExtendedFieldValue

protected static final org.slf4j.Logger log

protected Domain(String theDomainName)
Parameters:
theDomainName - Name used to reference the domain
Throws:
ArgumentNotValid - if the argument is null or empty, or if the domain name does not match the regex for valid domains

public static Domain getDefaultDomain(String domainName)
Parameters:
domainName - The name of the domain
Throws:
ArgumentNotValid - if name is null or empty

public void addConfiguration(DomainConfiguration cfg)
Parameters:
cfg - the configuration that is added
Throws:
UnknownID - if the name of the seedlist referenced by cfg is unknown
PermissionDenied - if a configuration with the same name already exists
ArgumentNotValid - if null supplied

public void addSeedList(SeedList seedlist)
Parameters:
seedlist - the actual seedlist
Throws:
ArgumentNotValid - if an argument is null
PermissionDenied - if a seedlist with the same name already exists

public void updateSeedList(SeedList seedlist)
Parameters:
seedlist - the actual seedlist
Throws:
ArgumentNotValid - if an argument is null
UnknownID - if no seedlist named seedlist.getName() exists

public void addPassword(Password password)
Parameters:
password - A password object to add.
Throws:
ArgumentNotValid - if the argument is null
PermissionDenied - if a password already exists with this name

public void updatePassword(Password password)
Parameters:
password - A password object to update.
Throws:
ArgumentNotValid - if the argument is null
PermissionDenied - if no password exists with this name

public void setDefaultConfiguration(String cfgName)
Parameters:
cfgName - the name of a configuration
Throws:
UnknownID - when the cfgName does not match an added configuration
ArgumentNotValid - if cfgName is null or empty

public DomainConfiguration getConfiguration(String cfgName)
Parameters:
cfgName - the name of a registered configuration
Throws:
UnknownID - if the name is not a registered configuration
ArgumentNotValid - if cfgName is null or empty

public DomainConfiguration getDefaultConfiguration()
Throws:
UnknownID - if no configurations exist

public String getComments()
Specified by:
getComments in interface Named

public DomainHistory getHistory()

public SeedList getSeedList(String name)
Parameters:
name - the name of the seedlist to return
Throws:
ArgumentNotValid - if name is null or empty
UnknownID - if no seedlist has been added with the supplied name

public boolean hasSeedList(String name)
Parameters:
name - String representing a possible seedlist for the domain.

public void removeSeedList(String name)
Parameters:
name - the name of the seedlist to remove
Throws:
PermissionDenied - if the seedlist is in use by a configuration or this is the last seedlist in this Domain
UnknownID - if no seedlist exists with the name
ArgumentNotValid - if a null argument is supplied

public void removePassword(String name)
Parameters:
name - the name of the password to remove
Throws:
PermissionDenied - if the password is in use by a configuration or this is the last password in this Domain
UnknownID - if no password exists with the name
ArgumentNotValid - if a null argument is supplied

public void removeConfiguration(String configName)
Parameters:
configName - The name of a configuration to remove.
Throws:
ArgumentNotValid - if name is null or empty
PermissionDenied - if an attempt is made to remove the default configuration, or if one or more HarvestDefinitions reference the configuration

public Iterator<DomainConfiguration> getAllConfigurations()

public Iterator<SeedList> getAllSeedLists()

public Iterator<Password> getAllPasswords()

public List<DomainConfiguration> getAllConfigurationsAsSortedList(Locale loc)
Parameters:
loc - the locale whose language the sorting must adhere to

public List<SeedList> getAllSeedListsAsSortedList(Locale loc)
Parameters:
loc - the locale whose language the sorting must adhere to

public List<Password> getAllPasswordsAsSortedList(Locale loc)
Parameters:
loc - the locale whose language the sorting must adhere to

public void addOwnerInfo(DomainOwnerInfo owner)
Parameters:
owner - the owner information to add

public DomainOwnerInfo[] getAllDomainOwnerInfo()
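
The loc parameter of the three sorted-list methods above selects the language whose ordering rules apply. As a minimal sketch of what locale-aware sorting looks like in Java, the following uses `java.text.Collator`. Whether Domain itself sorts by name with a Collator is an assumption; the class `LocaleSortSketch` and the helper `sortedNames` are hypothetical.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not part of the Domain class.
class LocaleSortSketch {
    /** Returns a copy of names, ordered by the collation rules of the given locale. */
    static List<String> sortedNames(List<String> names, Locale loc) {
        List<String> copy = new ArrayList<>(names);
        // Collator implements Comparator<Object>, so it can be passed to List.sort.
        copy.sort(Collator.getInstance(loc));
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("banana", "Cherry", "apple");
        // Unlike String.compareTo, collation does not place all upper-case names first.
        System.out.println(sortedNames(names, Locale.ENGLISH));
    }
}
```

A collator compares letters before case, so mixed-case names interleave as a reader would expect instead of being grouped by code point.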

public Password getPassword(String name)
Parameters:
name - the id of the password settings to retrieve
Throws:
UnknownID - if no password info exists with the id "name"

public void setComments(String comments)
Parameters:
comments - The new comments (can be null)

public void updateConfiguration(DomainConfiguration cfg)
Parameters:
cfg - the configuration to update
Throws:
UnknownID - if no configuration exists with the id cfg.getName()
ArgumentNotValid - if cfg is null

public boolean hasPassword(String passwordName)
Parameters:
passwordName - the identifier of the password info

public boolean hasConfiguration(String configName)
Parameters:
configName - the identifier of the configuration

public long getEdition()

public void setEdition(long theNewEdition)
Parameters:
theNewEdition - the new edition

public long getID()

public void setCrawlerTraps(List<String> regExps, boolean strictMode)
Parameters:
regExps - The list defining URLs never to be harvested.
strictMode - If true, an ArgumentNotValid exception is thrown if invalid regexps are found
Throws:
ArgumentNotValid - if regExps is null or regExps contains invalid regular expressions (unless strictMode is false)

public List<String> getCrawlerTraps()
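
The strictMode behaviour documented for setCrawlerTraps can be sketched with the standard `java.util.regex` API. This is an illustration under stated assumptions, not the actual method body: the class `CrawlerTrapSketch` and its methods are hypothetical, `IllegalArgumentException` stands in for `ArgumentNotValid`, and accepting the list unchanged in non-strict mode is an assumption, since the documentation above does not say what happens to invalid entries in that case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical sketch; IllegalArgumentException stands in for ArgumentNotValid.
class CrawlerTrapSketch {
    /** Returns the entries that do not compile as Java regular expressions. */
    static List<String> findInvalid(List<String> regExps) {
        List<String> invalid = new ArrayList<>();
        for (String regExp : regExps) {
            try {
                Pattern.compile(regExp);
            } catch (PatternSyntaxException e) {
                invalid.add(regExp);
            }
        }
        return invalid;
    }

    /** Validates the trap list; rejects invalid entries only in strict mode. */
    static List<String> checkTraps(List<String> regExps, boolean strictMode) {
        if (regExps == null) {
            throw new IllegalArgumentException("regExps must not be null");
        }
        List<String> invalid = findInvalid(regExps);
        if (strictMode && !invalid.isEmpty()) {
            throw new IllegalArgumentException("Invalid regexps: " + invalid);
        }
        // Assumption: in non-strict mode the list is accepted as given.
        return new ArrayList<>(regExps);
    }

    public static void main(String[] args) {
        List<String> traps = Arrays.asList(".*/calendar/.*", "(unclosed");
        System.out.println("Invalid entries: " + findInvalid(traps));
    }
}
```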

public AliasInfo getAliasInfo()

public void updateAlias(String alias)
Parameters:
alias - The name (e.g. "netarkivet.dk") of the domain that this domain is an alias of.
Throws:
UnknownID - If the given domain does not exist
IllegalState - If updating the alias info would violate the constraints on aliases: no transitivity, no reflexivity.

public HarvestInfo getBestHarvestInfoExpectation(String configName)
Parameters:
configName - The name of the configuration

protected int getExtendedFieldType()
getExtendedFieldType in class ExtendableEntity
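
The alias constraints named for updateAlias (no transitivity, no reflection) can be made concrete with a small stand-alone model. This is a sketch under assumptions, not the library's logic: the class `AliasConstraintSketch` is hypothetical, "no transitivity" is read here as "alias chains are not allowed", and standard Java exceptions stand in for `UnknownID` and `IllegalState`.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model for illustration only; NOT the NetarchiveSuite code.
class AliasConstraintSketch {
    private final Set<String> domains = new HashSet<>();
    // Maps an alias domain to the domain it is an alias of.
    private final Map<String, String> aliasOf = new HashMap<>();

    void addDomain(String name) {
        domains.add(name);
    }

    void updateAlias(String domain, String target) {
        if (!domains.contains(domain) || !domains.contains(target)) {
            throw new IllegalArgumentException("Unknown domain"); // stands in for UnknownID
        }
        // No reflection: a domain cannot be an alias of itself.
        if (domain.equals(target)) {
            throw new IllegalStateException(domain + " cannot be an alias of itself");
        }
        // No transitivity (assumed meaning): the target must not itself be an alias...
        if (aliasOf.containsKey(target)) {
            throw new IllegalStateException(target + " is itself an alias");
        }
        // ...and no other domain may already be an alias of this domain.
        if (aliasOf.containsValue(domain)) {
            throw new IllegalStateException("Other domains are aliases of " + domain);
        }
        aliasOf.put(domain, target);
    }

    /** Returns the domain that the given domain is an alias of, or null if none. */
    String getAliasOf(String domain) {
        return aliasOf.get(domain);
    }

    public static void main(String[] args) {
        AliasConstraintSketch sketch = new AliasConstraintSketch();
        sketch.addDomain("netarkivet.dk");
        sketch.addDomain("example.dk");
        sketch.updateAlias("example.dk", "netarkivet.dk");
        System.out.println("example.dk is an alias of " + sketch.getAliasOf("example.dk"));
    }
}
```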

Copyright © 2005–2018 The Royal Danish Library, the National Library of France and the Austrian National Library. All rights reserved.