Agenda for the joint NetarchiveSuite tele-conference 2020-10-06, 13:00-14:00.


  • BNF: Clara, Sara
  • ONB: Andreas
  • KB/DK - Copenhagen: Tue, Stephen, Anders 
  • KB/DK - Aarhus: Kristian, Colin
  • BNE: Alicia
  • KB/Sweden: Pär, Peter

Join from PC, Mac, Linux, iOS or Android:

Or an H.323/SIP room system:

    Meeting ID: 104 443 571

    SIP: 104443571@

Or Skype for Business (Lync):

Or Telephone:

Denmark: +45 89 88 37 88 or +45 32 71 31 57
United Kingdom: +44 203 051 2874 or +44 203 481 5237 or +44 203 966 3809 or +44 131 460 1196
Finland: +358 9 4245 1488 or +358 3 4109 2129
Sweden: +46 850 539 728 or +46 8 4468 2488
Norway: +47 7349 4877 or +47 2396 0588
US: +1 669 900 6833 or +1 646 558 8656
    Meeting ID: 104 443 571

    International numbers available:

You can join a meeting using the apps on a PC, a tablet or a smartphone, but you can also use the browser-based version (it works with newer versions of Chrome or Firefox).

Update on NAS latest tests and developments

Any feedback on NAS 6.0?

Status of the production sites


Broad crawl
Step 2 is proceeding well.

Event crawl
We decided to continue the event crawl on Corona in Denmark, but at a lower frequency, with the number of 0-hop sites greatly reduced and with minimal curatorial activity.

Alexandre, trainee
Alexandre has arrived and is up and running, working remotely from Copenhagen with the rest of the team. We are almost done with the introductory programme and are looking into what will give the most value to Alexandre, Netarkivet and BnF.

IT-University in Copenhagen:
The collaboration with the IT-University in Copenhagen is moving forward.

We have experimented with capturing embedded video content, and so far the results are great (except that WARC validation does not work correctly with revisit records).

We are working on finalizing a workflow from Webrecorder/ to Netarkivet. Being able to validate WARC files correctly is a big part of reaching the right level of preservation (we use JWAT for this). But it's a bit complicated; see for instance this OPF blog post by Remco van Veenendaal from Holland:

(How) are you validating WARC-files? And what is the future on this?


Our annual broad crawl will be launched this Tuesday, 6 October. This will be our first broad crawl with the new NAS version, which includes the official IIPC Heritrix 3, and we expect more efficient crawling. We have reduced the maximum number of URLs per domain to 2000 (from 2500 last year), and we expect to harvest between 110 and 115 TB.

Next week, we will put into production a new version of our public GUI, "Archives de l'internet", which makes the video channels harvested in July available to readers. These channels cover the COVID-19 outbreak and the French local elections. The new version also includes a mechanism for displaying Instagram account pages in our web archive and browsing the posts using Picuki directly from the Instagram page. Finally, this new version gives access to three paid online newspaper titles, harvested with authentication.




Next meetings

  • November 3, 2020
  • December 8, 2020
  • January 5, 2021

Any other business?

