  1. NetarchiveSuite
  2. NAS-337

Quadratic scaling in AdminData persistence

Details

    • Bug
    • Resolution: Fixed
    • Major
    • None
    • 0.5
    • Archive
    • None

    Description

      Each time a new checksum or store state is registered
      by the AdminData class, it must be written to disk
      (so that we can restart with the same knowledge).
      In the current version this is handled by calling write(),
      which writes ALL data to disk. This is normally 3 lines
      per ingested file. With 10000 jobs in a few days' harvesting
      we may have 30000 lines or more written to disk on each change of
      store state, so the total amount written grows quadratically with
      the number of ingested files.
      A DAO would fix this, but a quick solution is to append new data to the
      persistent file rather than rewriting everything from scratch.
      Note that we would then need two data files, one for each Map
      that is handled persistently (checksumRefTable and storeStates).
      NOTE: This bug is originally from Bugzilla bug_id=324.
      This bug was originally submitted by NHC.
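
      The append-based quick fix described above could look roughly like the
      following sketch. It is not the actual NetarchiveSuite code: the class
      AppendOnlyMapStore, its methods, the file names, and the "key value"
      line format are all hypothetical, and it assumes keys (file names)
      contain no spaces. One instance would be created per persistent Map
      (checksumRefTable and storeStates), each with its own file.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of NetarchiveSuite): one append-only file per Map. */
class AppendOnlyMapStore {
    private final Path file;
    private final Map<String, String> entries = new HashMap<>();

    AppendOnlyMapStore(Path file) {
        this.file = file;
        load();
    }

    /** Replays the file; a later line for the same key overrides an earlier one. */
    private void load() {
        if (!Files.exists(file)) {
            return;
        }
        try {
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                int sep = line.indexOf(' ');
                if (sep > 0) { // skip blank or malformed lines
                    entries.put(line.substring(0, sep), line.substring(sep + 1));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Records one change by appending a single line instead of rewriting the file. */
    void put(String key, String value) {
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            out.write(key + " " + value);
            out.newLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        entries.put(key, value);
    }

    String get(String key) {
        return entries.get(key);
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        // File name and state names below are placeholders for illustration only.
        Path dir = Paths.get(System.getProperty("java.io.tmpdir"));
        AppendOnlyMapStore storeStates = new AppendOnlyMapStore(dir.resolve("storeStates.log"));
        storeStates.put("example.arc", "UPLOAD_STARTED");
        storeStates.put("example.arc", "UPLOAD_COMPLETED");
        System.out.println(storeStates.get("example.arc"));
    }
}
```

      With this layout each change costs one appended line, and the full
      state is rebuilt only at restart by replaying the file. The file keeps
      growing as entries are overwritten, so an occasional full rewrite
      (compaction) would probably still be needed; that is left out of the
      sketch.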

          People

            lars [X] (Inactive)
            Anonymous