NetarchiveSuite / NAS-2171

Generation of snapshot jobs takes too long

Details

    • Bug
    • Resolution: Rejected
    • Blocker
    • None
    • 4.0
    • Harvest Definition
    • None
    • SB/KB

    Description

      NetarchiveSuite appears to perform poorly when generating jobs for a snapshot harvest, at least when using the PostgreSQL database.

      It could, however, also be an algorithmic issue.

      The performance we have seen in the Netarkivet production and test environments is one job generated every 65 minutes, with each job handling 10,000 domains. The Netarkivet domains database now holds close to 2 million domains.

      At roughly 200 jobs (2 million domains at 10,000 domains per job) and 65 minutes per job, generating the full complement of jobs for the snapshot harvest takes over 200 hours (about 9 days).
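
      The estimate can be checked with a small calculation. This is only a sketch: it assumes jobs are generated one after another at a constant rate, and the function name `snapshot_generation_estimate` is made up for illustration and is not part of NetarchiveSuite.

```python
import math


def snapshot_generation_estimate(total_domains, domains_per_job, minutes_per_job):
    """Return (job_count, total_hours, total_days), assuming sequential generation."""
    # Each job covers a fixed number of domains; round up for a final partial job.
    job_count = math.ceil(total_domains / domains_per_job)
    total_hours = job_count * minutes_per_job / 60
    return job_count, total_hours, total_hours / 24


# Figures reported in this issue: ~2 million domains, 10,000 per job, 65 minutes per job.
jobs, hours, days = snapshot_generation_estimate(2_000_000, 10_000, 65)
print(f"{jobs} jobs, {hours:.1f} hours, {days:.1f} days")
# prints: 200 jobs, 216.7 hours, 9.0 days
```

      Under these assumptions the total scales linearly with the number of domains, so the figure grows as the domains database grows.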

          People

            Assignee: Unassigned
            Reporter: Søren Vejrup Carlsen (Inactive)