  1. NetarchiveSuite
  2. NAS-2150

Heritrix WARCWriterProcessor failure, missing 'harvestInfo.scheduleName' in 'harvestInfo.xml'

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • SB/KB

    Description

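      No WARC records are written at all. Every write attempt fails while the writer pool is creating a WARC writer, because 'harvestInfo.scheduleName' is missing from 'harvestInfo.xml':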

      dk.netarkivet.common.exceptions.UnknownID: No elements exists for the path 'harvestInfo.scheduleName' in '/home/test/TEST2/harvester_low/4_1357746071263/harvestInfo.xml'
      at dk.netarkivet.common.utils.SimpleXml.getString(SimpleXml.java:272)
      at dk.netarkivet.harvester.harvesting.metadata.PersistentJobData.getScheduleName(PersistentJobData.java:615)
      at dk.netarkivet.harvester.harvesting.WARCWriterProcessor.getFirstrecordBody(WARCWriterProcessor.java:759)
      at org.archive.crawler.framework.WriterPoolProcessor.cacheMetadata(WriterPoolProcessor.java:669)
      at org.archive.crawler.framework.WriterPoolProcessor.getMetadata(WriterPoolProcessor.java:642)
      at org.archive.io.warc.WARCWriterPool$1.makeObject(WARCWriterPool.java:62)
      at org.apache.commons.pool.impl.FairGenericObjectPool.borrowObject(FairGenericObjectPool.java:262)
      at org.archive.io.WriterPool.borrowFile(WriterPool.java:139)
      at dk.netarkivet.harvester.harvesting.WARCWriterProcessor.write(WARCWriterProcessor.java:251)
      at dk.netarkivet.harvester.harvesting.WARCWriterProcessor.innerProcess(WARCWriterProcessor.java:235)
      at org.archive.crawler.framework.Processor.process(Processor.java:109)
      at org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:306)
      at org.archive.crawler.framework.ToeThread.run(ToeThread.java:154)
      2013-01-09 15:41:13.721 SEVERE thread-14 dk.netarkivet.harvester.harvesting.WARCWriterProcessor.innerProcess() Failed write of Record: dns:www.dbc.dk
      java.io.IOException: Failed getting writer from pool: No elements exists for the path 'harvestInfo.scheduleName' in '/home/test/TEST2/harvester_low/4_1357746071263/harvestInfo.xml'
      at org.archive.io.WriterPool.borrowFile(WriterPool.java:159)
      at dk.netarkivet.harvester.harvesting.WARCWriterProcessor.write(WARCWriterProcessor.java:251)
      at dk.netarkivet.harvester.harvesting.WARCWriterProcessor.innerProcess(WARCWriterProcessor.java:235)
      at org.archive.crawler.framework.Processor.process(Processor.java:109)
      at org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:306)
      at org.archive.crawler.framework.ToeThread.run(ToeThread.java:154)
      2013-01-09 15:41:14.212 SEVERE thread-15 org.archive.io.WriterPool.borrowFile() E Pool State: Active 0 of max 5, idle 0, time 86ms of max 300000ms
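
The trace above shows the lookup of 'harvestInfo.scheduleName' throwing as soon as the element is absent, which then fails every writer borrowed from the pool. As an illustration only, and not the fix that was applied in NetarchiveSuite, a tolerant lookup could be sketched as follows. The class `ScheduleNameLookup` and the method `readScheduleName` are hypothetical names, and the sketch assumes `harvestInfo.xml` has an un-namespaced root element `harvestInfo` with an optional `scheduleName` child, which is inferred from the path in the error message.

```java
import java.io.StringReader;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; it is not part of NetarchiveSuite.
final class ScheduleNameLookup {

    private ScheduleNameLookup() {}

    /**
     * Returns the trimmed text of /harvestInfo/scheduleName, or
     * Optional.empty() when the element is absent or blank, instead of
     * throwing as the lookup in the trace does.
     * Assumption: the document uses no XML namespace.
     */
    static Optional<String> readScheduleName(String harvestInfoXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(harvestInfoXml)));
        // XPath.evaluate(String, Object) yields "" when no node matches.
        String value = XPathFactory.newInstance()
                .newXPath()
                .evaluate("/harvestInfo/scheduleName", doc)
                .trim();
        return value.isEmpty() ? Optional.empty() : Optional.of(value);
    }
}
```

A caller writing the first WARC record could then decide explicitly what to do with an empty result, for example omit the field or substitute a placeholder, rather than letting the missing element abort the write.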

            People

              svc Søren Vejrup Carlsen (Inactive)
              nicl@kb.dk Nicholas Clarke (Inactive)
              Watchers: 3
