public class OnNSDomainsDecideRule extends org.archive.crawler.deciderules.SurtPrefixedDecideRule
Modifier and Type | Field and Description |
---|---|
static String | NON_VALID_DOMAIN The value that SurtPrefixSet.prefixFromPlain returns for an invalid URI. |
static Pattern | SURT_FIRSTPART_PATTERN Pattern that matches the first part of SURT - until ?? |
ATTR_ALSO_CHECK_VIA, ATTR_REBUILD_ON_RECONFIG, ATTR_SEEDS_AS_SURT_PREFIXES, ATTR_SURTS_DUMP_FILE, ATTR_SURTS_SOURCE_FILE, DEFAULT_ALSO_CHECK_VIA, DEFAULT_REBUILD_ON_RECONFIG, surtPrefixes
Constructor and Description |
---|
OnNSDomainsDecideRule(String s) Constructor for the class OnNSDomainsDecideRule. |
Modifier and Type | Method and Description |
---|---|
static String | convertToDomain(String uri) Convert a URI to its domain. |
protected void | myBuildSurtPrefixSet() Rebuilds the SurtPrefixSet to include only topmost domains, according to the domain definition in NetarchiveSuite. |
protected String | prefixFrom(String uri) Generate the SURT prefix that matches the domain definition of NetarchiveSuite. |
protected void | readPrefixes() Overrides the default readPrefixes so that this rule builds its own prefixes. |
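
To illustrate what converting a URI to a domain and then to a domain-level SURT prefix can look like, here is a minimal standalone sketch. It does not call OnNSDomainsDecideRule or any Heritrix class. The class name DomainSurtSketch and the methods toDomain and toSurtPrefix are hypothetical, and the rule "a domain is the last two host labels" is a simplifying assumption, not the NetarchiveSuite domain definition. The prefix shape `http://(dk,example,` follows the usual SURT convention of reversed, comma-separated host labels.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical illustration only; not part of NetarchiveSuite or Heritrix.
final class DomainSurtSketch {

    // Returns the last two host labels of the URI, or null if the URI
    // cannot be parsed or has no host. ASSUMPTION: "domain" means the last
    // two labels; the real NetarchiveSuite definition may differ.
    static String toDomain(String uri) {
        String host;
        try {
            host = new URI(uri).getHost();
        } catch (URISyntaxException e) {
            return null;
        }
        if (host == null) {
            return null;
        }
        String[] labels = host.toLowerCase().split("\\.");
        if (labels.length < 2) {
            return host.toLowerCase();
        }
        return labels[labels.length - 2] + "." + labels[labels.length - 1];
    }

    // Builds a SURT-style prefix from a domain by reversing its labels,
    // e.g. "example.dk" becomes "http://(dk,example,". The open-ended form
    // (no closing parenthesis) is meant to cover all hosts under the domain.
    static String toSurtPrefix(String domain) {
        String[] labels = domain.split("\\.");
        StringBuilder sb = new StringBuilder("http://(");
        for (int i = labels.length - 1; i >= 0; i--) {
            sb.append(labels[i]).append(',');
        }
        return sb.toString();
    }
}
```

Under these assumptions, a seed such as `http://www.example.dk/path` would reduce to the domain `example.dk`, and the resulting prefix would admit any host below that domain rather than only the seed's own host.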
addedSeed, buildSurtPrefixSet, dumpSurtPrefixSet, evaluate, getSeedfile, kickUpdate
singlePossibleNonPassDecision
addElementToDefinition, checkValue, earlyInitialize, getAbsoluteName, getAttribute, getAttribute, getAttribute, getAttributeInfo, getAttributeInfo, getAttributeInfoIterator, getAttributes, getDataContainerRecursive, getDataContainerRecursive, getDefaultValue, getDescription, getElementFromDefinition, getLegalValues, getLocalAttribute, getMBeanInfo, getMBeanInfo, getParent, getPreservedFields, getSettingsHandler, getUncheckedAttribute, getValue, globalSettings, invoke, isInitialized, isOverridden, iterator, removeElementFromDefinition, setAsOrder, setAttribute, setAttribute, setAttributes, setDescription, setPreservedFields, toString, unsetAttribute
public static final String NON_VALID_DOMAIN
public static final Pattern SURT_FIRSTPART_PATTERN
public OnNSDomainsDecideRule(String s)
s
- The name of this DecideRule
protected void readPrefixes()
readPrefixes
in class org.archive.crawler.deciderules.SurtPrefixedDecideRule
protected void myBuildSurtPrefixSet()
protected String prefixFrom(String uri)
prefixFrom
in class org.archive.crawler.deciderules.SurtPrefixedDecideRule
uri
- URL to convert to SURT
public static String convertToDomain(String uri)
uri
- URL to convert to the topmost domain name according to the NetarchiveSuite definition
Copyright © 2005–2016 The Royal Danish Library, the Danish State and University Library, the National Library of France and the Austrian National Library. All rights reserved.