
The 2019 NAS workshop will take place on February 20-22 and will be hosted by the National Library of Spain in Madrid.

Location: National Library of Spain (entrance on the ground floor)

Address: Paseo de Recoletos, 20-22 - 28071-Madrid


Hotels: see below






Colin Samuel Rosenthal

Knud Aage Hansen 

Tue Hejlskov Larsen

Kristian Bak

Anders Klindt Myrvoll

Sabine Schostag

Stephen Hunt

ONB (via Skype)

Andreas P



Sara Aubry

Clara Wiatrowski

Géraldine Camile


Juan Carlos García

José María Martín

Fernando Monzón

Luis Sánchez

Nuria Serrano

Alicia Pastrana

María Bueno

María Ezquerra

Mar Pérez

NL of Sweden

Thomas Roos

Pär Nilsson

Peter Svanberg

Topics to be discussed:

  • State of the art of current bugs and possible fixes
  • State of the art of current and upcoming developments in NetarchiveSuite
  • Integration of latest H3 stable release
  • Videos, social media, Umbra (installation, configuration tests and usage)
  • Introducing WARC 1.1
  • Brainstorming on priorities for future developments
  • Brief state of the art on access tools: use and perspectives in the different institutions
  • SolrWayback demo
  • What does your NAS deployment/configuration look like (settings, hardware)?
  • Oracle Java is subject to a fee. ONB's IT department would like to switch completely to OpenJava if possible. Does NAS work as usual on OpenJava? What do you think? What will you do?

  • NAS missing features
  • Scheduling harvests at a precise date or period
  • Presentation of the latest BCweb release and future evolutions
  • Presentation of how we collect and crawl YouTube and give access
  • Coordination of external selections
  • What documentation shall we provide for researchers?
  • How do we make workspaces for researchers - tools, limits?
  • Capturing social media
  • Webarchives and digital preservation
  • Which browsers and versions do we support today and in the near future in harvester requests, Umbra and archive access? Umbra usage and experiences
  • OpenWayback and CDX creation issues and development, and experiences with other tools, e.g. pywb, SolrWayback?
  • Broad crawls
    • How do we monitor jobs during broad or big “deep” crawls?
    • How do we manage huge webhotels?
    • How do we manage byte/objects limits for different groups of domains?


Schedule for 20.02.2019 (12:30-17:30)

12:30 - 14:00 Arrival, sandwiches and coffee

14:00 - 14:15 Welcome (Ana Santos Aramburo, director of BNE)

14:15 - 14:30 Workshop introduction (Mar, Sara)

14:30 - 16:00 Institution updates and plans for 2019 (15 min each)

  • Update from ONB (Michaela, Andreas)
  • Update from BNE (Mar, Juan Carlos)
  • Update from KB Sweden (Pär, Thomas)
  • Update from BnF (Géraldine)
  • Update from KB Denmark (Sabine, Tue)

16:00 - 16:15 Coffee break

16:15 - 17:30 NetarchiveSuite 5.5: demo and discussion of latest features including Umbra (Colin), Umbra usage and experiences, Feedback on tests (input from Clara, Tue, Andreas, ?). State of the art of current and upcoming developments (Colin)

20:30 Dinner (at own expense)

Schedule for 21.02.2019 (9:00-17:00)

09:00 - 12:00

Technical track:

  • Share NAS deployment and configuration in our institutions to identify used/unused components (Colin to draft a "form" for common language, each institution to fill it in)
  • Discuss state of the art of current bugs and possible fixes
    • Current bugs identified during ONB Domain Crawl 2018 (Andreas)
    • Current bugs identified during BnF Domain Crawl 2018 (Sara, Clara)
  • Review lists of NAS bugs and missing features (2017 DK list, BnF list and internal lists) and label issues of interest "Madrid"
  • Review of issues labelled "Madrid"

  • Discuss possible integration of OpenJava, latest H3 stable release and WARC 1.1
  • Brainstorm on priorities and NAS codebase evolution for future developments
  • Discuss the possibility to submit an IIPC project

Curator track:

  • Review and update the NAS curator roadmap (NASC) with comments from the Vienna meeting (Sabine)
  • Brainstorm on priorities for future developments from a curatorial perspective
  • Discuss what type of documentation to provide for researchers (Michaela)
  • Discuss practices and challenges in coordinating external selections (Géraldine, Sabine, Mar?)

10:30 - 10:45 Coffee break

12:00 - 12:30 Summary of curatorial and technical priorities

12:30 - 14:00 Lunch

14:00 - 17:00 Complex harvesting

Share experiences, practices and questions in the management of broad crawls (input from Tue, Sara, BNE, others?):

  • How do we monitor jobs during broad or big “deep” crawls?
  • How do we manage huge webhotels?
  • How do we track parked domains (web parking)?
  • How do we manage byte/objects limits for different groups of domains?

  • How do we manage many harvesters running at the same time?

Share experiences and practices in crawling and giving access to YouTube videos (Sara, ?)

Share experiences in crawling social media (which media? who?)

Discuss possible further cooperation on these topics, common tools integration

15:30 - 15:45 Coffee break

17:00 - 19:00 Guided tour of the BNE

Schedule for 22.02.2019 (9:00-14:30)

09:00 - 10:30 Update on BCweb (Géraldine, Clara)

  • Demo of BCweb new functionalities
  • Update on BnF current and upcoming developments
  • Update on open source status
  • Discuss interest in upgrading and possible community developments

10:30 - 10:45 Coffee break

10:45 - 12:30 Access tools for web archives:

  • Discuss perspectives, projects and questions in the different institutions (input Sara, Géraldine, ?)
  • Browser, OpenWayback and CDX creation issues and development, experiences with other tools, e.g. pywb, SolrWayback
  • ??? Demo of SolrWayback

12:30 - 13:00 Community next steps

13:00 - 14:30 Lunch and goodbye


Hotels nearby: We suggest looking for offers for the hotels below, as the Library can't provide special offers.

