  1. NetarchiveSuite
  2. NAS-2644

Seed lists in NAS are sorted alphabetically, which breaks the crawl order for sites where the login page must be crawled first


Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • 5.5.1
    • 5.2
    • None
    • None
    • BNF

    Description

      The seed lists in NAS are sorted alphabetically, which breaks the intended crawl order for sites where the login page must be harvested first.

      Question: At what point is the sorting performed? Heritrix is also known to do some seed-list editing of its own.
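
      To illustrate the symptom, here is a minimal sketch, not taken from the NetarchiveSuite code base. It assumes, purely for illustration, that the seeds pass through a sorted collection somewhere on their way to the crawler. The class name, method names, and example URLs are all hypothetical. It shows how a sorted set moves the login page out of first position, while an insertion-ordered set removes duplicates and keeps the order the seeds were entered in.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration only; not NetarchiveSuite or Heritrix code.
class SeedOrderSketch {

    /** Deduplicates via a sorted set: the result comes back in alphabetical order. */
    static List<String> dedupeSorted(List<String> seeds) {
        return new ArrayList<>(new TreeSet<>(seeds));
    }

    /** Deduplicates via an insertion-ordered set: the entered order is preserved. */
    static List<String> dedupeKeepingOrder(List<String> seeds) {
        return new ArrayList<>(new LinkedHashSet<>(seeds));
    }

    public static void main(String[] args) {
        // The login page is deliberately listed first (example URLs are made up).
        List<String> seeds = Arrays.asList(
                "http://example.org/login",
                "http://example.org/archive",
                "http://example.org/login",
                "http://example.org/members");

        // Alphabetical order: archive, login, members (login is no longer first).
        System.out.println(dedupeSorted(seeds));

        // Entered order: login, archive, members (login stays first).
        System.out.println(dedupeKeepingOrder(seeds));
    }
}
```

      If the sorting does turn out to happen on the NAS side, an order-preserving collection of this kind might be one way to keep the login page first. Whether Heritrix reorders the seeds afterwards would still need to be checked separately.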


          People

            Unassigned
            svc Søren Vejrup Carlsen (Inactive)
