Class OnNSDomainsDecideRule

  • All Implemented Interfaces: java.util.EventListener, org.archive.checkpointing.Checkpointable, org.archive.modules.seeds.SeedListener, org.archive.spring.HasKeyedProperties, org.springframework.beans.factory.Aware, org.springframework.beans.factory.BeanNameAware, org.springframework.context.ApplicationListener<org.springframework.context.ApplicationEvent>

    public class OnNSDomainsDecideRule
    extends org.archive.modules.deciderules.surt.SurtPrefixedDecideRule
    Decide rule that re-creates the SurtPrefixSet so that it contains only domain names, according to the domain definition of NetarchiveSuite. NetarchiveSuite cannot use org.archive.crawler.deciderules.OnDomainsDecideRule because that rule uses a different domain definition.
    • Field Summary

      Modifier and Type Field Description
      static java.lang.String NON_VALID_DOMAIN
      This is what SurtPrefixSet.prefixFromPlain returns for an invalid URI.
      static java.util.regex.Pattern SURT_FIRSTPART_PATTERN
      Pattern that matches the first part of SURT - until ??
    • Constructor Summary

      Constructor Description
      OnNSDomainsDecideRule()
      Constructor for the class OnNSDomainsDecideRule.
    • Method Summary

      Modifier and Type Method Description
      static java.lang.String convertToDomain​(java.lang.String uri)
      Convert a URI to its domain.
      protected void myBuildSurtPrefixSet()
      Method that rebuilds the SurtPrefixSet to include only topmost domains - according to the domain definition in NetarchiveSuite.
      protected java.lang.String prefixFrom​(java.lang.String uri)
      Generate the SURT prefix that matches the domain definition of NetarchiveSuite.
      protected void readPrefixes()
      Overrides the default readPrefixes, because this rule builds its own prefixes.
    • Field Detail


        public static final java.util.regex.Pattern SURT_FIRSTPART_PATTERN
        Pattern that matches the first part of SURT - until ??
    • Constructor Detail

      • OnNSDomainsDecideRule

        public OnNSDomainsDecideRule()
        Constructor for the class OnNSDomainsDecideRule. The rule makes the configured decision for any URI that lies inside one of the domains in the configured set of domains, according to the domain definition of the NetarchiveSuite system. For example, [...] resolves to [...], but [...] resolves to [...].
    • Method Detail

      • readPrefixes

        protected void readPrefixes()
        Overrides the default readPrefixes, because this rule builds its own prefixes.
        Overrides: readPrefixes in class org.archive.modules.deciderules.surt.SurtPrefixedDecideRule
      • myBuildSurtPrefixSet

        protected void myBuildSurtPrefixSet()
        Method that rebuilds the SurtPrefixSet to include only topmost domains - according to the domain definition in NetarchiveSuite. This is only done once, during the startup phase?
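The rebuild step can be pictured as collapsing every prefix in the set to its domain-level form and de-duplicating the result. The following is a minimal sketch of that idea, not the NetarchiveSuite implementation. It assumes, purely for illustration, that a domain is the top-level label plus one more label; the class name, the method name and the regular expression are hypothetical, and the pattern is not the actual SURT_FIRSTPART_PATTERN.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PrefixSetRebuildSketch {
    // Hypothetical pattern: scheme, "(", then the first two
    // comma-terminated SURT labels; the rest of the prefix is ignored.
    private static final Pattern FIRST_TWO_LABELS =
            Pattern.compile("^([a-z]+://\\((?:[^,()/]+,){2}).*$");

    // Returns a new sorted set in which every prefix with at least two
    // host labels is cut back to "scheme://(tld,label,".
    // Prefixes that do not match are kept unchanged.
    static Set<String> rebuild(Iterable<String> prefixes) {
        Set<String> rebuilt = new TreeSet<>();
        for (String prefix : prefixes) {
            Matcher m = FIRST_TWO_LABELS.matcher(prefix);
            rebuilt.add(m.matches() ? m.group(1) : prefix);
        }
        return rebuilt;
    }

    public static void main(String[] args) {
        Set<String> result = rebuild(Arrays.asList(
                "http://(com,example,www,)/path",
                "http://(com,example,"));
        // Both inputs collapse to the same domain-level prefix.
        System.out.println(result);
    }
}
```

Because the result is a set, several host-level or path-level prefixes inside the same domain end up as a single entry.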
      • prefixFrom

        protected java.lang.String prefixFrom​(java.lang.String uri)
        Generate the SURT prefix that matches the domain definition of NetarchiveSuite.
        Overrides: prefixFrom in class org.archive.modules.deciderules.surt.SurtPrefixedDecideRule
        Parameters: uri - URL to convert to SURT
        Returns: String with SURT that matches the domain definition of NetarchiveSuite
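To illustrate what a domain-level SURT prefix looks like, here is a small sketch that turns a URL into a prefix of the form http://(tld,label, using only the JDK. It is not the method documented above. The class and method names are hypothetical, the scheme is fixed to http as an assumption, and the rule "domain = last two host labels" is a simplification standing in for the NetarchiveSuite domain definition. IP-address hosts are not handled here.

```java
import java.net.URI;
import java.util.Locale;

class DomainSurtPrefixSketch {
    // Builds "http://(tld,label," from the host of the given URL.
    // Assumption: the domain is the last two labels of the host name.
    static String domainSurtPrefix(String url) {
        String host = URI.create(url).getHost();
        if (host == null) {
            throw new IllegalArgumentException("No host in: " + url);
        }
        String[] labels = host.toLowerCase(Locale.ROOT).split("\\.");
        int n = labels.length;
        if (n < 2) {
            // Single-label host: use it as the only SURT label.
            return "http://(" + labels[0] + ",";
        }
        // SURT form lists host labels in reverse order, comma-terminated.
        return "http://(" + labels[n - 1] + "," + labels[n - 2] + ",";
    }

    public static void main(String[] args) {
        System.out.println(domainSurtPrefix("http://www.example.com/some/path"));
    }
}
```

A prefix in this open-ended form matches the domain itself and every host below it, because all of their SURT forms start with the same reversed labels.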
      • convertToDomain

        public static java.lang.String convertToDomain​(java.lang.String uri)
        Convert a URI to its domain.
        Parameters: uri - URL to convert to the topmost domain name according to the NetarchiveSuite definition
        Returns: Domain name
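The reduction from a URL to a domain name can be sketched as follows. This is an illustration under stated assumptions, not the NetarchiveSuite code: the class and method names are hypothetical, a domain is assumed to be the last two labels of the host name, and an IPv4 address is assumed to count as a domain in its own right.

```java
import java.net.URI;
import java.util.Locale;
import java.util.regex.Pattern;

class ConvertToDomainSketch {
    // Loose IPv4 shape check; it does not validate the numeric ranges.
    private static final Pattern IPV4 =
            Pattern.compile("\\d{1,3}(\\.\\d{1,3}){3}");

    // Returns the assumed domain for the host part of the given URL.
    static String toDomain(String url) {
        String host = URI.create(url).getHost();
        if (host == null) {
            throw new IllegalArgumentException("No host in: " + url);
        }
        host = host.toLowerCase(Locale.ROOT);
        if (IPV4.matcher(host).matches()) {
            return host; // assumption: an IP address is kept whole
        }
        String[] labels = host.split("\\.");
        int n = labels.length;
        // Hosts with one or two labels are already at domain level.
        return n <= 2 ? host : labels[n - 2] + "." + labels[n - 1];
    }

    public static void main(String[] args) {
        System.out.println(toDomain("http://www.sub.example.com/index.html"));
        System.out.println(toDomain("http://192.168.0.1/index.html"));
    }
}
```

A real domain definition usually needs a list of multi-label suffixes (for example co.uk), which the two-label rule in this sketch deliberately ignores.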