  1. NetarchiveSuite
  2. NAS-2650

harvestInfo.origHarvestDefinitionComments are missing in warcinfo records

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: 5.4
    • Affects Version/s: 5.3, 5.3.1
    • Component/s: WARC
    • Labels: None
    • BNF
    • NAS 5.4

    Description

      The harvestInfo.origHarvestDefinitionComments field is not included in the warcinfo records of WARC data files. The harvestInfo fields currently written are:
      harvestInfo.version: 0.5
      harvestInfo.jobId: 18099
      harvestInfo.channel: PRESSE
      harvestInfo.harvestNum: 1474
      harvestInfo.origHarvestDefinitionID: 28
      harvestInfo.maxBytesPerDomain: -1
      harvestInfo.maxObjectsPerDomain: 10000
      harvestInfo.orderXMLName: page+1actu
      harvestInfo.origHarvestDefinitionName: BnF actualites quotidienne micro
      harvestInfo.scheduleName: quotidienne
      harvestInfo.harvestFilenamePrefix: BnF-18099-28
      harvestInfo.jobSubmitDate: 2016-03-13T09:00:24Z

      While trying to fix this, we ran into another bug: non-ASCII characters are mis-encoded, both in this field and in other text fields (origHarvestDefinitionName, HarvestTemplate).
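The ticket does not state the cause of the mis-encoding. As a minimal sketch of one common way such symptoms arise, the Python snippet below serializes harvestInfo fields as UTF-8 and then decodes the same bytes with the wrong charset. The helper name `build_harvest_info_payload`, the accented field values, and the choice of ISO-8859-1 as the wrong charset are all assumptions made for illustration; none of this is NetarchiveSuite code.

```python
def build_harvest_info_payload(fields):
    """Serialize (name, value) pairs as 'harvestInfo.<name>: <value>' lines.

    Hypothetical helper: the line format mirrors the listing in this ticket,
    and the bytes are explicitly encoded as UTF-8.
    """
    lines = [f"harvestInfo.{name}: {value}" for name, value in fields]
    return ("\r\n".join(lines) + "\r\n").encode("utf-8")


# Hypothetical values containing non-ASCII characters.
fields = [
    ("origHarvestDefinitionName", "BnF actualités quotidienne micro"),
    ("origHarvestDefinitionComments", "Sélection presse"),
]

payload = build_harvest_info_payload(fields)

# Decoding with the charset used for writing round-trips cleanly.
correct = payload.decode("utf-8")

# Decoding the same bytes as ISO-8859-1 yields the classic garbling:
# each accented character turns into two unrelated characters.
garbled = payload.decode("iso-8859-1")

print(correct)
print(garbled)
```

If the real cause is of this kind, the usual remedy is to name the charset explicitly at every point where text is converted to or from bytes, rather than relying on a platform default.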

          People

            Assignee: Unassigned
            Reporter: Sara Aubry
