The 2017 NAS workshop will take place on April 26-28 and will be hosted by the Austrian National Library in Vienna.


Address: Austrian National Library / Training Department, Augasse 2-6, 1090 Vienna. Please note: this is not the historic library premises in the city center!


Public transport: metro lines U4 and U6, tram D

Route planner:


Hotels nearby:

Arthotel ANA Katharina (2 min walk)

ibis Styles Wien City (5 min walk)

Hotel Boltzmann (15-20 min walk)






Colin Rosenthal

Tue Larsen

Søren Vejrup Carlsen

Sabine Schostag

Stephen Hunt


Andreas Predikaka

Michaela Mayr


Sara Aubry

Thomas F.

Géraldine Camile


Juan Carlos García Arratia

Fernando Monzón

Mar Pérez Morillo

NL of Sweden

Eva Meszaro

Pär Nilsson

Daniel Jansson

Topics to be discussed:

NAS5 / Heritrix 3 - technical
  • State of the art of current developments
  • Upcoming developments
  • Introducing a multiple-crawler approach into NAS

  • Videos/social media harvesting
  • What CDX format are you using today, and which do you plan to support within the next year?

  • Which version of (Open)Wayback are you using today, and what do you think about the future development of OpenWayback?

  • Performance of the Wayback index: how to speed it up? Any experience with splitting the index into several chunks or serving it from multiple hosts?
  • Which social media can you archive today?

  • How to consolidate crawl.log and frontier search features in NetarchiveSuite?
  • BNF's free-text search (better than KB DK's): anything to share with the community?
  • Automatic quality assurance: any ideas? Proofs of concept?
  • Others?

Heritrix 3 - curatorial
  • Feedback on using NAS 5 and Heritrix 3
  • Missing features
  • Priorities for future development
  • Is it possible to connect tools other than Heritrix to NAS (tools that can produce WARC files and capture content that Heritrix cannot)? If so, which tools do we want to use?
  • Revival and update of the curator roadmap
  • Harvesting the electoral web: selection, harvest parameters
  • Experiences with harvesting pages with login content (paywalls)
  • Experiences with harvesting images embedded in JavaScript (and replaying them in the archive)
  • Exchange of experiences with documenting crawls (inside and outside NAS)
  • Others?

DRAFT agenda

Schedule for 26.04.2017 (12:30-17:30)

12:30 - 14:00 Welcome, sandwiches and coffee

14:00 - 14:30 Workshop introduction (Michaela, Sara)

14:30 - 16:00 Institution updates and plans for 2017

16:00 - 16:15 Coffee break

16:15 - 17:30 NetarchiveSuite 5.3: demo and discussion of latest features and installation challenges (Colin, Sara)

19:30 Dinner (at own expense)

Schedule for 27.04.2017 (9:00-17:00)

09:00 - 11:30

Technical track:

Curator track:

10:30 - 10:45 Coffee break

11:30 - 12:30 Common track: presentation of curator and technical priorities, update of the NAS curator roadmap

12:30 - 14:00 Lunch

14:00 - 15:30

Common track: Harvesting videos and social media

15:30 - 15:45 Coffee break

15:45 - 17:00

Technical track:

Curator track:

17:00 - 19:00 Guided tour of the Austrian National Library State Hall (optional)

Schedule for 28.04.2017 (9:00-14:30)

09:00 - 09:45

Common track: Demo of BCweb functionalities (Sara)

09:45 - 11:45

Technical track:

Curator track:

10:30 - 10:45 Coffee break

11:45 - 13:00 Tracks sum-up, community next steps

13:00 - 14:30 Lunch and goodbye