  • We are still running our ongoing selective crawls (the biggest is an annual crawl focused on large hosts and domains and on social movements).
  • We installed Java 1.7.79 on some harvesters within a specific channel to solve HTTPS problems for specific crawls (news and official publications).
  • We are still working on our Corpus project.


  • Please see the Doodle poll to select the dates for the NAS meeting in Vienna (25.-27.1. or 30.1.-1.2.).
  • The event crawl about the presidential elections is still ongoing. One of the political parties is blocking our crawlers. We used to archive the website. We still need to find a way to integrate the content into our web archive.
  • We completed an important milestone: the new user interface for the in-house web archive terminals has been launched. It includes a partial full-text search and a screenshot preview, and it uses OpenWayback. The next step will be going online with the search function only; archived content will not be accessible.


  • The first .es domain crawl has been running since April 4th. Our engineers estimate it will last until the end of July or mid-August.
  • We are trying to connect BCWeb to the NAS development environment to give access to the web curators from our regional libraries.
  • As our General Elections are going to be repeated on June 26th, we have not yet closed the General Elections event crawl that started in December 2015. Web curators from the regional libraries are nominating seeds for this collection.
  • At the moment, the web archiving team is even smaller than it was. There used to be two librarians (Sole and me), but I'm now on my own because Sole moved to another position at the Library. We are trying to recruit more people for the team, but so far the situation is even worse than it used to be.