java.lang.Object
  javax.management.Attribute
    org.archive.crawler.settings.Type
      org.archive.crawler.settings.ComplexType
        org.archive.crawler.settings.ModuleType
          org.archive.crawler.deciderules.DecideRule
            org.archive.crawler.deciderules.ConfiguredDecideRule
              org.archive.crawler.deciderules.PredicatedDecideRule
                org.archive.crawler.deciderules.SurtPrefixedDecideRule
                  dk.netarkivet.harvester.harvesting.OnNSDomainsDecideRule
public class OnNSDomainsDecideRule extends org.archive.crawler.deciderules.SurtPrefixedDecideRule
Decide rule that re-creates the SurtPrefixSet so that it contains only domain names, as defined by NetarchiveSuite. NetarchiveSuite cannot use org.archive.crawler.deciderules.OnDomainsDecideRule because that rule uses a different definition of a domain.
Nested Class Summary |
---|
Nested classes/interfaces inherited from class org.archive.crawler.settings.ComplexType |
---|
org.archive.crawler.settings.ComplexType.MBeanAttributeInfoIterator |
Field Summary | |
---|---|
static java.lang.String | NON_VALID_DOMAIN: The value that SurtPrefixSet.prefixFromPlain returns for an invalid URI. |
static java.util.regex.Pattern | SURT_FIRSTPART_PATTERN: Pattern that matches the first part of a SURT - until ?? |
Fields inherited from class org.archive.crawler.deciderules.SurtPrefixedDecideRule |
---|
ATTR_ALSO_CHECK_VIA, ATTR_REBUILD_ON_RECONFIG, ATTR_SEEDS_AS_SURT_PREFIXES, ATTR_SURTS_DUMP_FILE, ATTR_SURTS_SOURCE_FILE, DEFAULT_ALSO_CHECK_VIA, DEFAULT_REBUILD_ON_RECONFIG, surtPrefixes |
Fields inherited from class org.archive.crawler.deciderules.ConfiguredDecideRule |
---|
ALLOWED_TYPES, ATTR_DECISION |
Fields inherited from class org.archive.crawler.deciderules.DecideRule |
---|
ACCEPT, PASS, REJECT |
Fields inherited from class org.archive.crawler.settings.ComplexType |
---|
definition, definitionMap |
Constructor Summary | |
---|---|
OnNSDomainsDecideRule(java.lang.String s)
Constructor for the class OnNSDomainsDecideRule. |
Method Summary | |
---|---|
static java.lang.String | convertToDomain(java.lang.String uri): Converts a URI to its domain. |
protected void | myBuildSurtPrefixSet(): Rebuilds the SurtPrefixSet so that it includes only topmost domains, according to the domain definition in NetarchiveSuite. |
protected java.lang.String | prefixFrom(java.lang.String uri): Generates the SURT prefix that matches the domain definition of NetarchiveSuite. |
protected void | readPrefixes(): Overrides the default readPrefixes so that this class builds its own prefixes. |
Methods inherited from class org.archive.crawler.deciderules.SurtPrefixedDecideRule |
---|
addedSeed, buildSurtPrefixSet, dumpSurtPrefixSet, evaluate, getSeedfile, kickUpdate |
Methods inherited from class org.archive.crawler.deciderules.PredicatedDecideRule |
---|
decisionFor |
Methods inherited from class org.archive.crawler.deciderules.ConfiguredDecideRule |
---|
singlePossibleNonPassDecision |
Methods inherited from class org.archive.crawler.deciderules.DecideRule |
---|
getController |
Methods inherited from class org.archive.crawler.settings.ModuleType |
---|
addElement, listUsedFiles |
Methods inherited from class org.archive.crawler.settings.ComplexType |
---|
addElementToDefinition, checkValue, earlyInitialize, getAbsoluteName, getAttribute, getAttribute, getAttribute, getAttributeInfo, getAttributeInfo, getAttributeInfoIterator, getAttributes, getDataContainerRecursive, getDataContainerRecursive, getDefaultValue, getDescription, getElementFromDefinition, getLegalValues, getLocalAttribute, getMBeanInfo, getMBeanInfo, getParent, getPreservedFields, getSettingsHandler, getUncheckedAttribute, getValue, globalSettings, invoke, isInitialized, isOverridden, iterator, removeElementFromDefinition, setAsOrder, setAttribute, setAttribute, setAttributes, setDescription, setPreservedFields, toString, unsetAttribute |
Methods inherited from class org.archive.crawler.settings.Type |
---|
addConstraint, equals, getConstraints, getLegalValueType, isExpertSetting, isOverrideable, isTransient, setExpertSetting, setLegalValueType, setOverrideable, setTransient |
Methods inherited from class javax.management.Attribute |
---|
getName, hashCode |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
public static final java.lang.String NON_VALID_DOMAIN
public static final java.util.regex.Pattern SURT_FIRSTPART_PATTERN
Constructor Detail |
---|
public OnNSDomainsDecideRule(java.lang.String s)
Parameters: s - The name of this DecideRule
Method Detail |
---|
protected void readPrefixes()
Overrides: readPrefixes in class org.archive.crawler.deciderules.SurtPrefixedDecideRule
protected void myBuildSurtPrefixSet()
protected java.lang.String prefixFrom(java.lang.String uri)
Overrides: prefixFrom in class org.archive.crawler.deciderules.SurtPrefixedDecideRule
Parameters: uri - URL to convert to SURT
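To illustrate what a domain-level SURT prefix looks like, here is a minimal sketch in plain Java. It is not the NetarchiveSuite or Heritrix implementation: the class SurtDomainPrefixSketch and the method surtPrefixForDomain are hypothetical names, and the sketch assumes the input is already a bare domain name. It reverses the host labels and leaves the prefix open (trailing comma, no closing parenthesis) so that all subdomains of the domain fall under the prefix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Hypothetical illustration only; not part of NetarchiveSuite or Heritrix. */
class SurtDomainPrefixSketch {

    /**
     * Builds a SURT prefix that covers a domain and all of its subdomains.
     * Assumes the argument is a bare domain name such as "netarkivet.dk".
     */
    static String surtPrefixForDomain(String domain) {
        List<String> labels = new ArrayList<>(
                Arrays.asList(domain.toLowerCase(Locale.ROOT).split("\\.")));
        // SURT form lists host labels from the most general to the most specific.
        Collections.reverse(labels);
        // The trailing comma without ")" keeps the prefix open for subdomains.
        return "http://(" + String.join(",", labels) + ",";
    }

    public static void main(String[] args) {
        // prints http://(dk,netarkivet,
        System.out.println(surtPrefixForDomain("netarkivet.dk"));
    }
}
```

A URI such as http://www.netarkivet.dk/ has the SURT form http://(dk,netarkivet,www,)/, which starts with the prefix produced above, so it would match a rule built from the domain netarkivet.dk.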
public static java.lang.String convertToDomain(java.lang.String uri)
Parameters: uri - URL to convert to the topmost domain name according to the NetarchiveSuite definition
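The NetarchiveSuite domain definition itself is not spelled out on this page, so the following is only a rough sketch of the general idea of reducing a URI to a topmost domain. It assumes, purely for illustration, that a domain is the last two labels of the host name; that assumption is wrong for multi-part public suffixes and for IP addresses, and the real rule may differ. The class DomainFromUriSketch and the method topmostDomain are hypothetical names, and an empty Optional stands in for whatever value the real method returns for an invalid URI.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical illustration only; not the NetarchiveSuite domain logic. */
class DomainFromUriSketch {

    /**
     * Reduces a URI to the last two labels of its host name.
     * Returns an empty Optional when the URI is invalid or has no host.
     */
    static Optional<String> topmostDomain(String uri) {
        String host;
        try {
            host = new URI(uri).getHost();
        } catch (URISyntaxException e) {
            return Optional.empty();
        }
        if (host == null || host.isEmpty()) {
            return Optional.empty();
        }
        String[] labels = host.toLowerCase(Locale.ROOT).split("\\.");
        if (labels.length < 2) {
            // Single-label hosts such as "localhost" are returned unchanged.
            return Optional.of(labels[0]);
        }
        // Assumption: the domain is the last two labels of the host.
        return Optional.of(labels[labels.length - 2] + "." + labels[labels.length - 1]);
    }

    public static void main(String[] args) {
        // prints Optional[netarkivet.dk]
        System.out.println(topmostDomain("http://www.netarkivet.dk/index.html"));
        // prints Optional.empty
        System.out.println(topmostDomain("not a uri"));
    }
}
```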