  1. NetarchiveSuite
  2. NAS-2687

Incomplete lines in the duplicationMigration are not caught in RawMetadataCache.migrateDuplicates()

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • 5.2.3, 5.4
    • 5.2.2, 5.3.1
    • IndexServer
    • None

      The test, presumably, is to go into the archive, deliberately mess up a crawl log line, generate an index, and check the log. I think this is a lot of work for such a localised change, and I'm not sure it is worth it on a cost/benefit/risk basis.

    Description

      In the compression project, we have managed to write incomplete lines to the duplicationMigration record in the metadata file.

      This causes an ArrayIndexOutOfBoundsException when the index server processes the job:

      15:46:53.179 INFO  d.n.h.indexserver.RawMetadataCache - 214466 migration records found for job 250990
      15:46:53.626 WARN  d.n.h.i.d.IndexRequestServer - Unable to generate index for jobs [250990]
      java.lang.ArrayIndexOutOfBoundsException: 2
              at dk.netarkivet.harvester.indexserver.RawMetadataCache.migrateDuplicates(RawMetadataCache.java:208) ~[harvester-core-5.3.2-RC1.jar:20fc2e1cb5158341eee04029fe1920934ed38048]
              at dk.netarkivet.harvester.indexserver.RawMetadataCache.cacheData(RawMetadataCache.java:142) ~[harvester-core-5.3.2-RC1.jar:20fc2e1cb5158341eee04029fe1920934ed38048]
              at dk.netarkivet.harvester.indexserver.RawMetadataCache.cacheData(RawMetadataCache.java:57) ~[harvester-core-5.3.2-RC1.jar:20fc2e1cb5158341eee04029fe1920934ed38048]
              at dk.netarkivet.harvester.indexserver.FileBasedCache.cache(FileBasedCache.java:146) ~[harvester-core-5.3.2-RC1.jar:20fc2e1cb5158341eee04029fe1920934ed38048]
              at dk.netarkivet.harvester.indexserver.FileBasedCache.get(FileBasedCache.java:174) ~[harvester-core-5.3.2-RC1.jar:20fc2e1cb5158341eee04029fe1920934ed38048]
              at dk.netarkivet.harvester.indexserver.CombiningMultiFileBasedCache.prepareCombine(CombiningMultiFileBasedCache.java:88) ~[harvester-core-5.3.2-RC1.jar:20fc2e1cb5158341eee04029fe1920934ed38048]
              at dk.netarkivet.harvester.indexserver.CrawlLogIndexCache.prepareCombine(CrawlLogIndexCache.java:106) ~[harvester-core-5.3.2-RC1.jar:20fc2e1cb5158341eee04029fe1920934ed38048]
              at dk.netarkivet.harvester.indexserver.CombiningMultiFileBasedCache.cacheData(CombiningMultiFileBasedCache.java:69) ~[harvester-core-5.3.2-RC1.jar:20fc2e1cb5158341eee04029fe1920934ed38048]
              at dk.netarkivet.harvester.indexserver.CombiningMultiFileBasedCache.cacheData(CombiningMultiFileBasedCache.java:43) ~[harvester-core-5.3.2-RC1.jar:20fc2e1cb5158341eee04029fe1920934ed38048]
              at dk.netarkivet.harvester.indexserver.FileBasedCache.cache(FileBasedCache.java:146) ~[harvester-core-5.3.2-RC1.jar:20fc2e1cb5158341eee04029fe1920934ed38048]
              at dk.netarkivet.harvester.indexserver.distribute.IndexRequestServer.doProcessIndexRequestMessage(IndexRequestServer.java:336) [harvester-core-5.3.2-RC1.jar:20fc2e1cb5158341eee04029fe1920934ed38048]
              at dk.netarkivet.harvester.indexserver.distribute.IndexRequestServer.access$000(IndexRequestServer.java:76) [harvester-core-5.3.2-RC1.jar:20fc2e1cb5158341eee04029fe1920934ed38048]
              at dk.netarkivet.harvester.indexserver.distribute.IndexRequestServer$2.run(IndexRequestServer.java:238) [harvester-core-5.3.2-RC1.jar:20fc2e1cb5158341eee04029fe1920934ed38048]
      15:46:53.627 INFO  d.n.h.i.d.IndexRequestServer - Sending failed reply for IndexRequestMessage back to sender '[Queue 'PROD_COMMON_THIS_INDEX_CLIENT_130_226_228_70_VP_8094']'.
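
      The failure mode in the trace above (index 2 out of bounds) is consistent with a line being split into fewer than three fields and then indexed unconditionally. The following is a minimal sketch of a defensive parse, not the actual RawMetadataCache code. It assumes each migration line holds three whitespace-separated fields (a file name, an old offset, and a new offset). The class name `MigrationLineParser`, the `skipped` list, and the field meanings are hypothetical and used only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: a hypothetical stand-in for the line handling in migrateDuplicates(). */
final class MigrationLineParser {
    /** Assumed field count, inferred from the exception being thrown at index 2. */
    static final int EXPECTED_FIELDS = 3;

    /** Lines rejected so far, kept so that a caller can log them as warnings. */
    final List<String> skipped = new ArrayList<>();

    /**
     * Builds a lookup from "fileName oldOffset" to newOffset.
     * Incomplete or non-numeric lines are recorded in {@link #skipped}
     * instead of aborting the whole index request.
     */
    Map<String, Long> parse(List<String> lines) {
        Map<String, Long> lookup = new HashMap<>();
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length < EXPECTED_FIELDS) {
                // Incomplete line: this is the case that previously threw.
                skipped.add(line);
                continue;
            }
            try {
                long oldOffset = Long.parseLong(fields[1]);
                long newOffset = Long.parseLong(fields[2]);
                lookup.put(fields[0] + " " + oldOffset, newOffset);
            } catch (NumberFormatException e) {
                // Right number of fields, but an offset is not a number.
                skipped.add(line);
            }
        }
        return lookup;
    }
}
```

      With this shape, one bad line costs only its own migration record, and the skipped lines can be written to the log at WARN level so the damaged metadata is still visible to operators.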
      

          People

            svc Søren Vejrup Carlsen (Inactive)

              Time Tracking

                Estimated: Not Specified
                Remaining: Not Specified
                Logged: 1m