NetarchiveSuite / NAS-2109

metadata://netarkivet.dk/crawl/reports/arcfiles-report.txt is empty when Heritrix is set to WARC writing


Details

      Can be verified by running a simple netarkivet.dk harvest in full WARC mode
      and then inspecting the contents of the metadata WARC file.
      The contents should include a non-empty metadata://netarkivet.dk/crawl/reports/arcfiles-report.txt record.
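
The check described above could be automated roughly as follows. This is only a sketch under several assumptions that are not confirmed by this issue: the metadata WARC file is uncompressed, each record carries `WARC-Target-URI` and `Content-Length` headers, and the target URI may have a query string appended (hence the prefix match). The class and method names are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ArcfilesReportCheck {
    static final String REPORT_URI =
        "metadata://netarkivet.dk/crawl/reports/arcfiles-report.txt";

    /** Returns the Content-Length of the first record whose target URI
     *  starts with uriPrefix, or -1 if no such record is found. */
    static long contentLengthOf(InputStream raw, String uriPrefix) throws IOException {
        InputStream in = new BufferedInputStream(raw);
        String line;
        while ((line = readLine(in)) != null) {
            if (!line.startsWith("WARC/")) continue; // blank lines between records
            String uri = null;
            long length = -1;
            // Read the header block up to the empty line that ends it.
            while ((line = readLine(in)) != null && !line.isEmpty()) {
                int colon = line.indexOf(':');
                if (colon < 0) continue;
                String name = line.substring(0, colon).trim();
                String value = line.substring(colon + 1).trim();
                if (name.equalsIgnoreCase("WARC-Target-URI")) {
                    // Some writers wrap the URI in angle brackets.
                    uri = value.replaceAll("^<|>$", "");
                } else if (name.equalsIgnoreCase("Content-Length")) {
                    length = Long.parseLong(value);
                }
            }
            if (uri != null && uri.startsWith(uriPrefix)) return length;
            skipFully(in, length); // jump over the record payload
        }
        return -1;
    }

    private static void skipFully(InputStream in, long count) throws IOException {
        while (count > 0) {
            long skipped = in.skip(count);
            if (skipped <= 0) {
                if (in.read() < 0) throw new EOFException("truncated record");
                skipped = 1;
            }
            count -= skipped;
        }
    }

    /** Reads one CRLF- or LF-terminated line; returns null at end of stream. */
    private static String readLine(InputStream in) throws IOException {
        int b = in.read();
        if (b < 0) return null;
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        while (b >= 0 && b != '\n') {
            buf.write(b);
            b = in.read();
        }
        String line = buf.toString("ISO-8859-1");
        return line.endsWith("\r") ? line.substring(0, line.length() - 1) : line;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            long length = contentLengthOf(in, REPORT_URI);
            System.out.println(length > 0 ? "report is non-empty (" + length + " bytes)"
                : length == 0 ? "report record is EMPTY" : "report record not found");
        }
    }
}
```

Run it with the path to the metadata WARC file as the only argument. A gzip-compressed file would first need to be decompressed, which this sketch does not handle.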


    Description

      The report metadata://netarkivet.dk/crawl/reports/arcfiles-report.txt is empty when Heritrix is set to WARC writing.

      The cause seems to be that INFO-level logging must be enabled in heritrix.properties for the WARCWriter class. Currently, it is only enabled for the ARCWriter class.
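
A minimal sketch of the suspected cause, assuming that heritrix.properties is read as a standard java.util.logging configuration and that the writers log under their class names `org.archive.io.arc.ARCWriter` and `org.archive.io.warc.WARCWriter`. Both logger names and the `WARNING` root level are assumptions, not taken from this issue. The sketch loads two variants of the configuration and reports whether INFO messages from the WARC writer's logger would be emitted.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.logging.Level;
import java.util.logging.LogManager;
import java.util.logging.Logger;

public class WriterLoggingCheck {
    // Assumed logger names (the fully qualified writer class names).
    static final String ARC_WRITER = "org.archive.io.arc.ARCWriter";
    static final String WARC_WRITER = "org.archive.io.warc.WARCWriter";

    // Assumed current state: INFO logging enabled for the ARC writer only.
    static final String ARC_ONLY =
        ".level = WARNING\n"
        + ARC_WRITER + ".level = INFO\n";

    // Proposed state: the same line added for the WARC writer.
    static final String ARC_AND_WARC =
        ARC_ONLY
        + WARC_WRITER + ".level = INFO\n";

    /** Loads the given logging properties and reports whether INFO
     *  messages from the named logger would be emitted. */
    static boolean infoEnabled(String properties, String loggerName) throws IOException {
        LogManager.getLogManager().readConfiguration(
            new ByteArrayInputStream(properties.getBytes(StandardCharsets.ISO_8859_1)));
        return Logger.getLogger(loggerName).isLoggable(Level.INFO);
    }

    public static void main(String[] args) throws IOException {
        System.out.println("ARC only, WARCWriter INFO enabled: "
            + infoEnabled(ARC_ONLY, WARC_WRITER));
        System.out.println("ARC and WARC, WARCWriter INFO enabled: "
            + infoEnabled(ARC_AND_WARC, WARC_WRITER));
    }
}
```

Under these assumptions, the change amounts to adding one `<logger name>.level = INFO` line for the WARC writer next to the existing ARC writer line in heritrix.properties.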


          People

            svc Søren Vejrup Carlsen (Inactive)