NetarchiveSuite / NAS-1314

We schedule new harvests based on harvest definitions that are not yet finished with the previous harvest

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • 0.5
    • None

    Description

      The scheduler schedules harvests from harvest definitions without consulting
      any information about previous harvests.
      As a result, the same harvest may be scheduled again even if an earlier harvest
      from the same harvest definition is not yet finished. This can congest the
      harvester queues.
      Worse still, snapshot harvests based on a previous harvest definition are
      scheduled even if the previous harvest definition is not yet done! As a result,
      only the domains that the previous harvest definition has harvested so far
      are included in the next phase.
      Whether the previous harvest is done can be determined by checking the jobs
      created from this harvest definition for any status other than FAILED or DONE.
      Any such job means that the harvest is not yet done.
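
      A minimal sketch of the check described above. The types JobState and
      SketchJob and the method isPreviousHarvestDone are hypothetical names made up
      for illustration, not NetarchiveSuite classes; only the statuses FAILED and
      DONE come from this issue, the other states are assumed.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical job states; only DONE and FAILED are named in this issue.
enum JobState { NEW, SUBMITTED, STARTED, DONE, FAILED }

// Hypothetical minimal job: the harvest definition it was made from and its state.
final class SketchJob {
    final long harvestDefinitionId;
    final JobState state;

    SketchJob(long harvestDefinitionId, JobState state) {
        this.harvestDefinitionId = harvestDefinitionId;
        this.state = state;
    }
}

final class HarvestCompletionCheck {
    // States that count as "finished" according to the description.
    private static final Set<JobState> FINISHED = EnumSet.of(JobState.DONE, JobState.FAILED);

    /**
     * Returns true if every job made from the given harvest definition is
     * DONE or FAILED. A definition with no jobs at all is treated as done.
     */
    static boolean isPreviousHarvestDone(long harvestDefinitionId, List<SketchJob> jobs) {
        return jobs.stream()
                .filter(job -> job.harvestDefinitionId == harvestDefinitionId)
                .allMatch(job -> FINISHED.contains(job.state));
    }
}
```

      Under these assumptions the scheduler would call isPreviousHarvestDone before
      creating new jobs for a harvest definition, and before starting a snapshot
      harvest that builds on an earlier one.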
      There is, however, a risk that a message is lost along the way and a job's
      status is never updated, so at least a warning must be issued whenever
      scheduling is delayed because of unfinished harvests.
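
      A rough sketch of how such a warning could be issued, using
      java.util.logging. The class DeferredSchedulingWarning, the method
      scheduleOrWarn and its parameters are hypothetical and only illustrate the
      idea; how the number of unfinished jobs is obtained is left to the caller.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.logging.Logger;

final class DeferredSchedulingWarning {
    private static final Logger LOG =
            Logger.getLogger(DeferredSchedulingWarning.class.getName());

    /**
     * Returns true if the harvest may be scheduled now. If jobs from the
     * previous harvest are still unfinished, logs a warning stating how long
     * the harvest has been overdue and returns false.
     */
    static boolean scheduleOrWarn(String harvestName, int unfinishedJobs,
                                  Instant due, Instant now) {
        if (unfinishedJobs == 0) {
            return true;
        }
        // How far past its due time the harvest is at this point.
        Duration delay = Duration.between(due, now);
        LOG.warning(String.format(
                "Scheduling of harvest '%s' is delayed by %d minute(s): "
                        + "%d job(s) from the previous harvest are neither DONE nor FAILED",
                harvestName, delay.toMinutes(), unfinishedJobs));
        return false;
    }
}
```

      Because a lost message could leave a job unfinished forever, an operator
      reading this warning repeatedly for the same harvest would know to inspect
      the stuck jobs by hand.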
      NOTE: This bug is originally from Bugzilla bug_id=318.
      This bug was previously assigned to Unassigned.

          People

            svc Søren Vejrup Carlsen (Inactive)
            kfc Kåre Fiedler Christiansen (Inactive)
            Watchers: 0

            Dates

              Created:
              Updated: