NetarchiveSuite / NAS-2495

NPE in ExtractorOAI in bundled Heritrix3

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.1
    • Component/s: None
    • Labels: None
    • Verification:
      Verified by running a netarkivet.dk harvest with ExtractorOAI enabled.
      Before the fix: the harvest logged the NPEs shown below.
      After the fix: it does not.

      Description

      SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
      SLF4J: Defaulting to no-operation (NOP) logger implementation
      SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
      2016-02-03 15:40:03.251 INFO thread-12 org.archive.crawler.framework.Engine.addJobDirectory() added crawl job: 11_1454513930810
      2016-02-03 15:40:04.572 INFO thread-12 org.archive.crawler.framework.CrawlJob.instantiateContainer() Job instantiated
      2016-02-03 15:40:04.832 INFO thread-12 org.archive.crawler.framework.CrawlJob.launch() Job launched
      2016-02-03 15:40:06.352 INFO thread-15 org.archive.spring.PathSharingContext.initLaunchId() launch id 20160203154006
      2016-02-03 15:40:06.704 INFO thread-15 org.archive.io.WriterPool.<init>() Initial configuration: prefix=11-1, template=${prefix}-${timestamp17}-${serialno}-ciblee_2015_${heritrix.hostname}, compress=false, maxSize=1000000000, maxActive=3, maxWait=500
      2016-02-03 15:40:06.756 INFO thread-15 org.archive.crawler.framework.CrawlJob.onApplicationEvent() PREPARING 20160203154006
      2016-02-03 15:40:07.732 INFO thread-20 org.archive.crawler.framework.CrawlController.noteFrontierState() Crawl running.
      2016-02-03 15:40:07.734 INFO thread-20 org.archive.crawler.framework.CrawlJob.onApplicationEvent() RUNNING 20160203154006
      2016-02-03 15:40:16.567 INFO thread-63 org.archive.modules.extractor.Extractor.handleException() Exception
      java.lang.NullPointerException
      	at dk.netarkivet.harvester.harvesting.extractor.ExtractorOAI.innerExtract(ExtractorOAI.java:111)
      	at org.archive.modules.extractor.ContentExtractor.extract(ContentExtractor.java:37)
      	at org.archive.modules.extractor.Extractor.innerProcess(Extractor.java:102)
      	at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
      	at org.archive.modules.Processor.process(Processor.java:142)
      	at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
      	at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
      2016-02-03 15:40:17.231 INFO thread-65 org.archive.modules.extractor.Extractor.handleException() Exception
      java.lang.NullPointerException
      	at dk.netarkivet.harvester.harvesting.extractor.ExtractorOAI.innerExtract(ExtractorOAI.java:111)
      	at org.archive.modules.extractor.ContentExtractor.extract(ContentExtractor.java:37)
      	at org.archive.modules.extractor.Extractor.innerProcess(Extractor.java:102)
      	at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
      	at org.archive.modules.Processor.process(Processor.java:142)
      	at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
      	at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
      2016-02-03 15:40:22.755 INFO thread-65 org.archive.modules.extractor.Extractor.handleException() Exception
      java.lang.NullPointerException
      	at dk.netarkivet.harvester.harvesting.extractor.ExtractorOAI.innerExtract(ExtractorOAI.java:111)
      	at org.archive.modules.extractor.ContentExtractor.extract(ContentExtractor.java:37)
      	at org.archive.modules.extractor.Extractor.innerProcess(Extractor.java:102)
      	at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
      	at org.archive.modules.Processor.process(Processor.java:142)
      	at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
      	at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
      2016-02-03 15:40:31.983 INFO thread-45 org.archive.modules.extractor.Extractor.handleException() Exception
      java.lang.NullPointerException
      	at dk.netarkivet.harvester.harvesting.extractor.ExtractorOAI.innerExtract(ExtractorOAI.java:111)
      	at org.archive.modules.extractor.ContentExtractor.extract(ContentExtractor.java:37)
      	at org.archive.modules.extractor.Extractor.innerProcess(Extractor.java:102)
      	at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
      	at org.archive.modules.Processor.process(Processor.java:142)
      	at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
      	at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
      2016-02-03 15:41:24.145 INFO thread-40 org.archive.modules.extractor.Extractor.handleException() Exception
      java.lang.NullPointerException
      	at dk.netarkivet.harvester.harvesting.extractor.ExtractorOAI.innerExtract(ExtractorOAI.java:111)
      	at org.archive.modules.extractor.ContentExtractor.extract(ContentExtractor.java:37)
      	at org.archive.modules.extractor.Extractor.innerProcess(Extractor.java:102)
      	at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
      	at org.archive.modules.Processor.process(Processor.java:142)
      	at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
      	at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
      2016-02-03 15:41:41.112 INFO thread-26 org.archive.modules.extractor.Extractor.handleException() Exception
      java.lang.NullPointerException
      	at dk.netarkivet.harvester.harvesting.extractor.ExtractorOAI.innerExtract(ExtractorOAI.java:111)
      	at org.archive.modules.extractor.ContentExtractor.extract(ContentExtractor.java:37)
      	at org.archive.modules.extractor.Extractor.innerProcess(Extractor.java:102)
      	at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
      	at org.archive.modules.Processor.process(Processor.java:142)
      	at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
      	at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
      2016-02-03 15:41:44.189 INFO thread-47 org.archive.modules.extractor.Extractor.handleException() Exception
      java.lang.NullPointerException
      	at dk.netarkivet.harvester.harvesting.extractor.ExtractorOAI.innerExtract(ExtractorOAI.java:111)
      	at org.archive.modules.extractor.ContentExtractor.extract(ContentExtractor.java:37)
      	at org.archive.modules.extractor.Extractor.innerProcess(Extractor.java:102)
      	at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
      	at org.archive.modules.Processor.process(Processor.java:142)
      	at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
      	at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
      2016-02-03 15:41:47.180 INFO thread-62 org.archive.modules.extractor.Extractor.handleException() Exception
      java.lang.NullPointerException
      	at dk.netarkivet.harvester.harvesting.extractor.ExtractorOAI.innerExtract(ExtractorOAI.java:111)
      	at org.archive.modules.extractor.ContentExtractor.extract(ContentExtractor.java:37)
      	at org.archive.modules.extractor.Extractor.innerProcess(Extractor.java:102)
      	at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
      	at org.archive.modules.Processor.process(Processor.java:142)
      	at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
      	at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
      2016-02-03 15:41:53.229 INFO thread-67 org.archive.modules.extractor.Extractor.handleException() Exception
      java.lang.NullPointerException
      	at dk.netarkivet.harvester.harvesting.extractor.ExtractorOAI.innerExtract(ExtractorOAI.java:111)
      	at org.archive.modules.extractor.ContentExtractor.extract(ContentExtractor.java:37)
      	at org.archive.modules.extractor.Extractor.innerProcess(Extractor.java:102)
      	at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
      	at org.archive.modules.Processor.process(Processor.java:142)
      	at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
      	at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
      2016-02-03 15:41:57.219 INFO thread-55 org.archive.modules.extractor.Extractor.handleException() Exception
      java.lang.NullPointerException
      	at dk.netarkivet.harvester.harvesting.extractor.ExtractorOAI.innerExtract(ExtractorOAI.java:111)
      	at org.archive.modules.extractor.ContentExtractor.extract(ContentExtractor.java:37)
      	at org.archive.modules.extractor.Extractor.innerProcess(Extractor.java:102)
      	at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
      	at org.archive.modules.Processor.process(Processor.java:142)
      	at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
      	at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
      2016-02-03 15:42:01.249 INFO thread-25 org.archive.modules.extractor.Extractor.handleException() Exception
      java.lang.NullPointerException
      	at dk.netarkivet.harvester.harvesting.extractor.ExtractorOAI.innerExtract(ExtractorOAI.java:111)
      	at org.archive.modules.extractor.ContentExtractor.extract(ContentExtractor.java:37)
      	at org.archive.modules.extractor.Extractor.innerProcess(Extractor.java:102)
      	at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
      	at org.archive.modules.Processor.process(Processor.java:142)
      	at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
      	at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
      2016-02-03 15:42:06.275 INFO thread-36 org.archive.modules.extractor.Extractor.handleException() Exception
      java.lang.NullPointerException
      	at dk.netarkivet.harvester.harvesting.extractor.ExtractorOAI.innerExtract(ExtractorOAI.java:111)
      	at org.archive.modules.extractor.ContentExtractor.extract(ContentExtractor.java:37)
      	at org.archive.modules.extractor.Extractor.innerProcess(Extractor.java:102)
      	at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
      	at org.archive.modules.Processor.process(Processor.java:142)
      	at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
      	at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
      2016-02-03 15:42:14.397 INFO thread-44 org.archive.modules.extractor.Extractor.handleException() Exception
      java.lang.NullPointerException
      	at dk.netarkivet.harvester.harvesting.extractor.ExtractorOAI.innerExtract(ExtractorOAI.java:111)
      	at org.archive.modules.extractor.ContentExtractor.extract(ContentExtractor.java:37)
      	at org.archive.modules.extractor.Extractor.innerProcess(Extractor.java:102)
      	at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
      	at org.archive.modules.Processor.process(Processor.java:142)
      	at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
      	at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
      2016-02-03 15:42:18.307 INFO thread-52 org.archive.modules.extractor.Extractor.handleException() Exception
      java.lang.NullPointerException
      	at dk.netarkivet.harvester.harvesting.extractor.ExtractorOAI.innerExtract(ExtractorOAI.java:111)
      	at org.archive.modules.extractor.ContentExtractor.extract(ContentExtractor.java:37)
      	at org.archive.modules.extractor.Extractor.innerProcess(Extractor.java:102)
      	at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
      	at org.archive.modules.Processor.process(Processor.java:142)
      	at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
      	at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
      2016-02-03 15:42:22.328 INFO thread-63 org.archive.modules.extractor.Extractor.handleException() Exception
      java.lang.NullPointerException
      	at dk.netarkivet.harvester.harvesting.extractor.ExtractorOAI.innerExtract(ExtractorOAI.java:111)
      	at org.archive.modules.extractor.ContentExtractor.extract(ContentExtractor.java:37)
      	at org.archive.modules.extractor.Extractor.innerProcess(Extractor.java:102)
      	at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
      	at org.archive.modules.Processor.process(Processor.java:142)
      	at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
      	at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
      2016-02-03 15:43:46.595 INFO thread-26 org.archive.modules.extractor.Extractor.handleException() Exception
      java.lang.NullPointerException
      	at dk.netarkivet.harvester.harvesting.extractor.ExtractorOAI.innerExtract(ExtractorOAI.java:111)
      	at org.archive.modules.extractor.ContentExtractor.extract(ContentExtractor.java:37)
      	at org.archive.modules.extractor.Extractor.innerProcess(Extractor.java:102)
      	at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
      	at org.archive.modules.Processor.process(Processor.java:142)
      	at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
      	at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
      2016-02-03 15:44:36.524 INFO thread-28 org.archive.modules.extractor.Extractor.handleException() Exception
      java.lang.NullPointerException
      	at dk.netarkivet.harvester.harvesting.extractor.ExtractorOAI.innerExtract(ExtractorOAI.java:111)
      	at org.archive.modules.extractor.ContentExtractor.extract(ContentExtractor.java:37)
      	at org.archive.modules.extractor.Extractor.innerProcess(Extractor.java:102)
      	at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
      	at org.archive.modules.Processor.process(Processor.java:142)
      	at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
      	at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
      2016-02-03 15:46:42.158 INFO thread-23 org.archive.modules.extractor.Extractor.handleException() Exception
      java.lang.NullPointerException
      	at dk.netarkivet.harvester.harvesting.extractor.ExtractorOAI.innerExtract(ExtractorOAI.java:111)
      	at org.archive.modules.extractor.ContentExtractor.extract(ContentExtractor.java:37)
      	at org.archive.modules.extractor.Extractor.innerProcess(Extractor.java:102)
      	at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
      	at org.archive.modules.Processor.process(Processor.java:142)
      	at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
      	at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
      2016-02-03 15:47:37.808 INFO thread-20 org.archive.crawler.framework.CrawlController.noteFrontierState() Crawl empty.
      2016-02-03 15:47:37.808 INFO thread-20 org.archive.crawler.framework.CrawlJob.onApplicationEvent() STOPPING 20160203154006
      2016-02-03 15:47:37.809 INFO thread-20 org.archive.crawler.framework.CrawlJob.onApplicationEvent() EMPTY 20160203154006
      2016-02-03 15:47:39.360 INFO thread-20 org.archive.crawler.reporting.StatisticsTracker.writeReportFile() wrote report: /home/devel/TEST6/harvester_high/11_1454513930810/heritrix3/./jobs/11_1454513930810/20160203154006/reports/crawl-report.txt
      2016-02-03 15:47:39.379 INFO thread-20 org.archive.crawler.reporting.StatisticsTracker.writeReportFile() wrote report: /home/devel/TEST6/harvester_high/11_1454513930810/heritrix3/./jobs/11_1454513930810/20160203154006/reports/seeds-report.txt
      2016-02-03 15:47:39.401 INFO thread-20 org.archive.crawler.reporting.StatisticsTracker.writeReportFile() wrote report: /home/devel/TEST6/harvester_high/11_1454513930810/heritrix3/./jobs/11_1454513930810/20160203154006/reports/hosts-report.txt
      2016-02-03 15:47:39.416 INFO thread-20 org.archive.crawler.reporting.StatisticsTracker.writeReportFile() wrote report: /home/devel/TEST6/harvester_high/11_1454513930810/heritrix3/./jobs/11_1454513930810/20160203154006/reports/source-report.txt
      2016-02-03 15:47:39.422 INFO thread-20 org.archive.crawler.reporting.StatisticsTracker.writeReportFile() wrote report: /home/devel/TEST6/harvester_high/11_1454513930810/heritrix3/./jobs/11_1454513930810/20160203154006/reports/mimetype-report.txt
      2016-02-03 15:47:39.426 INFO thread-20 org.archive.crawler.reporting.StatisticsTracker.writeReportFile() wrote report: /home/devel/TEST6/harvester_high/11_1454513930810/heritrix3/./jobs/11_1454513930810/20160203154006/reports/responsecode-report.txt
      2016-02-03 15:47:39.427 INFO thread-20 org.archive.modules.writer.WARCWriterProcessor.report() final stats: {response={numRecords=582, totalBytes=37483767, contentBytes=37277685, sizeOnDisk=37483767}, totals={numRecords=582, totalBytes=37483767, contentBytes=37277685, sizeOnDisk=37483767}, warcinfo={numRecords=0, totalBytes=0, contentBytes=0, sizeOnDisk=0}}
      2016-02-03 15:47:39.428 INFO thread-20 org.archive.crawler.reporting.StatisticsTracker.writeReportFile() wrote report: /home/devel/TEST6/harvester_high/11_1454513930810/heritrix3/./jobs/11_1454513930810/20160203154006/reports/processors-report.txt
      2016-02-03 15:47:39.434 INFO thread-20 org.archive.crawler.reporting.StatisticsTracker.writeReportFile() wrote report: /home/devel/TEST6/harvester_high/11_1454513930810/heritrix3/./jobs/11_1454513930810/20160203154006/reports/frontier-summary-report.txt
      2016-02-03 15:47:39.434 INFO thread-20 org.archive.crawler.reporting.StatisticsTracker.writeReportFile() wrote report: /home/devel/TEST6/harvester_high/11_1454513930810/heritrix3/./jobs/11_1454513930810/20160203154006/reports/threads-report.txt
      2016-02-03 15:47:39.434 INFO thread-20 org.archive.crawler.framework.CheckpointService.stop() Cleaned up Checkpoint TimerThread.
      2016-02-03 15:47:39.440 INFO thread-20 org.archive.crawler.framework.CrawlJob.onApplicationEvent() FINISHED 20160203154006
      2016-02-03 15:47:39.440 INFO thread-20 org.archive.crawler.frontier.AbstractFrontier.crawlEnded() Closing with 0 urls still in queue.
      2016-02-03 15:51:06.987 INFO thread-79 org.archive.crawler.framework.CrawlJob.doTeardown() Job instance discarded
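
      Every trace above ends at ExtractorOAI.innerExtract(ExtractorOAI.java:111), but the issue does not record which reference was null there. As a rough illustration only, the sketch below shows the kind of null guard that avoids this class of failure in an extractor that looks for an OAI-PMH resumption token. The class OaiExtractSketch, the method findResumptionToken, its three parameters, and the simplified regular expression are all hypothetical; none of them is taken from the NetarchiveSuite or Heritrix sources.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for illustration; NOT the NetarchiveSuite ExtractorOAI class. */
public class OaiExtractSketch {

    // Simplified pattern for an OAI-PMH <resumptionToken> element (assumption).
    private static final Pattern TOKEN =
            Pattern.compile("<resumptionToken[^>]*>([^<]+)</resumptionToken>");

    /**
     * Returns the resumption token, or null when there is nothing to extract.
     * Every input is null-checked before it is dereferenced, so a URI without
     * a query string or a response without a content type cannot cause an NPE.
     */
    public static String findResumptionToken(String contentType, String query, String body) {
        if (contentType == null || query == null || body == null) {
            return null; // nothing to inspect; skip instead of throwing
        }
        if (!contentType.contains("xml") || !query.contains("verb=ListRecords")) {
            return null; // not an OAI-PMH ListRecords response
        }
        Matcher m = TOKEN.matcher(body);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // All inputs missing: returns null rather than throwing.
        System.out.println(findResumptionToken(null, null, null));
        // A well-formed ListRecords response: prints the token.
        System.out.println(findResumptionToken("text/xml", "verb=ListRecords",
                "<resumptionToken>abc123</resumptionToken>"));
    }
}
```

      Returning early on missing input keeps the crawl log free of repeated stack traces for URIs the extractor has no business processing; whether the actual fix in 5.1 took this form is not stated in the issue.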
      

              People

              • Assignee: svc Søren Vejrup Carlsen
              • Reporter: svc Søren Vejrup Carlsen