Agenda for the joint BNF, ONB, SB, KB and BNE NetarchiveSuite tele-conference 2016-08-23, 13:00-14:00.

Practical information


  • BNF: Lam, Annick
  • ONB: Michaela, Andreas
  • KB/DK: Søren, Stephen, Nicholas
  • SB: Sabine
  • BNE: Mar, Juan Carlos, Fernando, Elena

NAS workshop in Vienna

January 30th 2017 - February 1st 2017 - Vienna

How many participants? Please complete Michaela's poll.

IIPC crawler hackathon in London

September 22-23. Is anyone attending?

NAS 5.2 Development Update

Feedback from KB/SB.

BnF migration to NAS 5 + H3 Update

Feedback from BnF.

  • installation of NAS 5.2-snapshot (development + stage/pre-production environment)
    • correcting BnF's deploy scripts (nas-deploy) for NAS 5
    • database migration (we have started preparing SQL scripts to migrate to the new database schema, but for the moment we use an empty database to test NAS 5)

  • migrating BnF developments from to
  • minor correction of date format
  • migrating host + domain profiles from the old order.xml format to the crawler-beans.cxml format
    • done by crawl engineer Sébastien Pivain-Leroy

  • correcting BnF's statistical tool for NAS (nas-qual) so that it handles both H1 and H3 report formats

  • pending: generate WARC revisit records in WARC 1.1 format
  • pending: archivefiles-report.txt is missing GMT dates and the closing date
    • NAS-2546
    • we can only correct the date format; the opened date cannot be retrieved
    • in dk.netarkivet.harvester.heritrix3.HarvestDocumentation, there is this comment :
      // Generate an arcfiles-report.txt if configured to do so.
      // This is not possible to extract from the crawl.log, but we will make one from just listing the files harvested by Heritrix3

      boolean genArcFilesReport = Settings.getBoolean(Heritrix3Settings.METADATA_GENERATE_ARCHIVE_FILES_REPORT);
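
As a rough illustration of what "listing the files" can and cannot provide, here is a minimal sketch, not the NetarchiveSuite implementation. The class name, the line layout (name, size, timestamp) and the date pattern are all assumptions made for this example. It uses the file's last-modified time as an approximation of the closing date and formats it in GMT; an opened date is not available from a plain directory listing.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of NetarchiveSuite.
class ArchiveFilesReportSketch {

    // Assumed timestamp layout; the real report format may differ.
    private static final DateTimeFormatter GMT_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss'Z'").withZone(ZoneOffset.UTC);

    static String formatGmt(Instant instant) {
        return GMT_FORMAT.format(instant);
    }

    static boolean isArchiveFile(Path path) {
        String name = path.getFileName().toString();
        return name.endsWith(".warc") || name.endsWith(".warc.gz")
                || name.endsWith(".arc") || name.endsWith(".arc.gz");
    }

    // One line per archive file: "<name> <size in bytes> <last modified, GMT>".
    static List<String> reportLines(Path dir) throws IOException {
        List<Path> archiveFiles;
        try (Stream<Path> files = Files.list(dir)) {
            archiveFiles = files.filter(Files::isRegularFile)
                    .filter(ArchiveFilesReportSketch::isArchiveFile)
                    .sorted()
                    .collect(Collectors.toList());
        }
        List<String> lines = new ArrayList<>();
        for (Path file : archiveFiles) {
            // Last-modified time stands in for the closing date (an approximation).
            Instant closed = Files.getLastModifiedTime(file).toInstant();
            lines.add(file.getFileName() + " " + Files.size(file) + " " + formatGmt(closed));
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (String line : reportLines(dir)) {
            System.out.println(line);
        }
    }
}
```
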


  • pending: attempt to launch the Heritrix instance with another version of Java
    • for instance Java 9 for new implementation of for https, but keep Java 7 for NAS
    • This does not look easy to do (see classes HeritrixLauncher & Heritrix3Wrapper)
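
For discussion, the general idea of starting a child process under a different JVM than the parent can be sketched with the standard `ProcessBuilder` API. This is only an illustration under stated assumptions: the class name, the paths and the main class in the usage note are hypothetical, and it does not reflect how HeritrixLauncher or Heritrix3Wrapper actually start Heritrix.

```java
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of NetarchiveSuite.
class AlternateJvmLaunchSketch {

    // Builds a command line that uses the java binary of the given JDK/JRE home.
    static List<String> buildCommand(String javaHome, String classpath,
                                     String mainClass, List<String> programArgs) {
        List<String> command = new ArrayList<>();
        command.add(Paths.get(javaHome, "bin", "java").toString());
        command.add("-cp");
        command.add(classpath);
        command.add(mainClass);
        command.addAll(programArgs);
        return command;
    }

    // Prepares (but does not start) the child process.
    static ProcessBuilder prepare(String javaHome, String classpath,
                                  String mainClass, List<String> programArgs) {
        ProcessBuilder builder =
                new ProcessBuilder(buildCommand(javaHome, classpath, mainClass, programArgs));
        // Expose the same JVM to any script the child may invoke.
        builder.environment().put("JAVA_HOME", javaHome);
        builder.inheritIO();
        return builder;
    }
}
```

A caller would then do something like `AlternateJvmLaunchSketch.prepare("/opt/jdk9", "/opt/heritrix/lib/*", mainClass, args).start()`, where both paths are placeholders, while the parent application keeps running on its own JVM.
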


Status of the production sites











Next meetings

  • September 20
  • October 25
  • November 29
  • January 3, 2017

Any other business?



