


We launched step 2 of our third broad crawl this year (with a limit of 14 GB per domain) on 2018/10/23.

We looked at all our open issues and grouped them thematically:

  • Harvesting problems
  • Replay problems
  • Improving existing functionalities
  • New functionalities
  • Automation of operations that are currently handled manually
  • Issues that will be solved by existing projects

The aim was to identify the most urgent problems that we cannot solve without developers' help.


We are working on implementing SOLR Wayback for searching in Netarchive. SOLR Wayback is still a prototype. Among other things, we need to clarify the UX, security questions and how to do the logging, and to choose a platform for user access.

However, the display of results is much better than in Blacklight.

We are working on a procedure for a new type of use of the archive: data extraction for research projects. The data to be extracted are determined by a search string; hopefully this will be fairly easy with SOLR Wayback.

We are going to prepare a mini event harvest, “Week 46”. In week 46 the Royal Library collects the productions of local broadcast stations (both radio and television). A couple of years ago Netarchive started collecting their home pages.





  • We are still running our domain crawl for this year and are in its last quarter. We expect it to finish in a few weeks.





Next meetings