
Agenda for the joint NetarchiveSuite tele-conference 2019-06-04, 13:00-14:00.


  • BNF: Sara, Géraldine, Clara
  • ONB: Andreas
  • KB/DK - Copenhagen: Tue, Stephen
  • KB/DK - Aarhus: Knud Åge
  • BNE: -
  • KB/Sweden: Par, Thomas, Peter

Update on NAS latest tests and developments

Status of the production sites


Our main focus is the elections for the European Parliament and the Danish parliamentary elections, which take place today. As the national parliamentary elections were announced during the campaign for the European parliamentary elections, we decided to merge the two events into one event collection:

  • We do not collect Danish news media, as the regular selective crawls cover this part.
  • We crawl Twitter and Instagram accounts of political parties and candidates with Heritrix, and Facebook accounts with our Archive-It account.
  • We crawl YouTube videos and podcasts on the elections.

We will finish the crawls when the new government is in place.

We finished our second broad crawl for 2019 last Monday.


At the beginning of May we had a meeting to prepare the program of the 2019 broad crawl. We have contacted the registrars that answered positively in 2018. This year we will try to clean up the seed list even further, because we noticed that many registered domains are in fact parking websites or just domain name bookings. We are also going to work on all the subjects raised during the NAS workshop. The launch of the broad crawl is scheduled for October.

On April 10th, the daily newspaper "Le Monde" announced that it would close its blog platform "les blogs abonnés du" at the very beginning of June. We contacted their product owner and harvested 6,250 blogs. To avoid any performance issues on the platform caused by Heritrix crawling, we chose to use a maximum of 3 threads in a single job.


  • We are waiting for our hardware exchange to be completed. It should be finished in a few weeks, so we can start our domain crawl by the beginning of July.
  • We started our event crawl on our government crisis.
  • We were invited to Bonn in Germany to present the Webarchiv to staff from regional libraries in Germany. I also met Jerome Schweitzer from the BNU Library of Strasbourg, who talked about how regional libraries work together with the BNF. He showed how they archive websites with BCWeb.


We do not know yet when we will start our annual broad crawl, due to the changes in our IT team. We will probably have a date after the IIPC meeting.

We continue working on the collections for the different elections: European Parliament elections, local elections and Spanish Government elections. We will probably finish the Spanish Government elections collection this month, because the elections took place at the end of April.

We keep the daily harvests in every event collection to collect Facebook and Twitter.

Alicia attended the Digital Library Futures: Symposium on Non-Print Legal Deposit, which took place in Cambridge on May 21. The participants discussed the White Paper published by the Arts and Humanities Research Council on the impact of Non-Print Legal Deposit (NPLD) on UK academic deposit libraries and their users. It may be of interest to you:


Next meetings

  • July 2
  • September 10
  • October 8
  • November 5
  • December 3
  • January 7, 2020

Any other business?

