
Time: 29 October 2012 (9-17) / 30 October 2012 (9-15)


Planned participants:





Sara Aubry, Bert Wendland

Clément Oury, Annick Lorthios, Sophie Derrot





Søren, Nicholas

Jón, Tue


Mikis Seth Sørensen
Colin Rosenthal

Sabine, Bjarne (first day)
Jens and Charlotte for curator and research tracks

Researchers Niels Ole Finnemann, Niels Brügger, Ulrich Karstoft Have

Topics to be discussed

JhoNAS/WARC, NetarchiveSuite, Heritrix 3
  • JhoNAS: feedback, discussions on testing the Jhove2 WARC module
  • JhoNAS: feedback, discussions on testing the NetarchiveSuite WARC implementation
  • Presentation of JWATools
  • Report/Feedback on the JhoNAS project
  • Discussions on switching to the WARC format (harvest, access, preservation)
  • NetarchiveSuite: daily work on focused crawls in Aarhus
  • NetarchiveSuite: how to manage the deposit of individual files/journals?
  • NetarchiveSuite: new features (recently implemented, under development, future work) 
  • NetarchiveSuite community: future developments, newcomers, wiki

  • What is to be done after the London workshop


Mikis and Colin will be waiting at the Statsbiblioteket reception by the front entrance from around 8:45. Mikis will also be available on phone +45 40409152. Colin can be contacted on +45 26462564.

Schedule for Day 1 (Monday 29)

09:00 - 10:00 Welcome and breakfast

Location: Large meeting room basement, 01.93

10:00 - 10:15 Workshop agenda (Sara)

Summary of the agenda, including any last minute additions.

10:15 - 11:30 Institution updates (all)

Each institution presents its main work topics for 2013 and its recent/upcoming developments.

11:30 - 12:00 Managing the deposit of individual files (Bjarne?, Clément, others?)

How to manage the manual deposit of files (ebooks, journals, ...) into the main production workflow? Presentation of SB/KB experience and BnF upcoming projects.

  • Discussion of the problems of including inherently structured data, like eBooks, in the 'unstructured' web archive.
  • Also discussion of where the deposit of individual files should be located organizationally.
  • An eBook etc. section on the curator wiki was proposed.

12:00 - 13:00 Heritrix 3 (Søren)

Outcomes from the London workshop. Which contributions from the NetarchiveSuite community?

See IIPC Heritrix Task Force - London Meeting report.docx for the outcome of the workshop.

We have no immediate plans for migrating to H3, as it would require some work to extend the current H3 interfaces to provide the functionality needed by NAS.

The current focus will be to try to establish a long-term development organization for H3, possibly through IIPC directly.


13:00 - 14:00 Lunch

14:00 - 17:00 Common track splits into 2 tracks: NetarchiveSuite / JhoNAS

19:00 - Dinner

Restaurant Komfur, find it here.

Schedule for Day 2 (Tuesday 30)

Location: Large meeting room basement, 01.93

09:00 - 10:45 Moving to the WARC format

11:00 - 12:00 What's next?

Actions to come for the NAS community. Potential newcomers. Tools. Work organization.

  • We need to continue the effort to create an operational curator wishlist for improvements to NetarchiveSuite. It was decided to use the current POC curator list found in JIRA to model the curator wishes. Sara will, together with the relevant curators, try to update the list found here with the current status, concluding in a teleconference. Afterwards the list will be used as input to the NAS development project iteration planning (end of January). Note that bugs should be created directly in the NAS project when discovered, but only after discussing the issue with a NAS developer.
  • The next meetings will probably be at KB, Monday-Tuesday, in the first week of October 2013. It was considered to invite NAS users from outside the current contributors, and even to hold the next meetings at one of their locations.


12:00 - 13:00 Lunch

13:00 - 15:00 Discussion with researchers