Agenda for the joint BNF, ONB, SB, KB and BNE NetarchiveSuite tele-conference 2016-09-20, 13:00-14:00.
September 22-23. Søren, Colin, Bert will attend.
Topics, attendees: https://drive.google.com/drive/folders/0BwTi-qdD0KvdNEE4Qmpaa2dJeHM
Common questions/interests to bring?
On the BnF side: some bugfixes:
Translation of new keys in French and German.
Considering the adoption of WARC revisit records for duplicates.
January 30th 2017 - February 1st 2017 - Vienna
NetarchiveSuite Curator Issues
Should we "reanimate" our curator roadmap/backlog, revise it and discuss it in Vienna?
Compression of the archive
Last but not least
Last week we learned that the Ministry of Culture wants KB and SB to merge: from January 2017 we will be “Nationalbiblioteket”, with two locations, in Copenhagen and Aarhus.
We are continuing to work on this year's broad crawl. We are preparing nas-preload, the tool used to combine the different sources into a single list to be loaded into NAS. This step also includes a DNS check to avoid slowing down the crawl with domains that do not have a DNS response. This year, in addition to excluding domains with no DNS response, we are also excluding those that give an "unknown" response, as we know from previous years that there is generally no content on these domains. Overall, the seed list will contain around 4.4 million active domains and will have improved coverage of the different regional TLDs: .alsace, .paris, .bzh (for Brittany) and the French West Indies.
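As a rough illustration of the DNS filtering step described above, here is a minimal Python sketch. It is not the nas-preload code: the function names, the status labels, and the mapping of resolver errors to "no DNS" versus "unknown" are all assumptions made for the example.

```python
import socket
from typing import Callable, Iterable, List

# Hypothetical status labels; the actual nas-preload categories may differ.
OK, NO_DNS, UNKNOWN = "ok", "no_dns", "unknown"


def dns_status(domain: str) -> str:
    """Classify a domain with the standard-library resolver.

    Assumption: a name that does not resolve counts as "no DNS",
    and any other resolver failure counts as "unknown".
    """
    try:
        socket.getaddrinfo(domain, None)
        return OK
    except socket.gaierror as exc:
        if exc.errno == socket.EAI_NONAME:
            return NO_DNS
        return UNKNOWN
    except (OSError, UnicodeError):
        # Other network errors or names that cannot be encoded.
        return UNKNOWN


def filter_seeds(
    domains: Iterable[str],
    check: Callable[[str], str] = dns_status,
) -> List[str]:
    """Keep only domains whose DNS status is OK.

    Both "no DNS" and "unknown" responses are excluded, as in this
    year's broad crawl preparation. Input order is preserved.
    """
    return [d for d in domains if check(d) == OK]
```

Passing the check as a parameter lets the filter be tried on a small sample with a stubbed resolver before running it against the full seed list.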
Turning to project crawls, the 2016 Olympiad is now over but our Olympics crawls are still running. The project, in line with the previous collaborative collections documenting the 2014 Sochi Winter Games and the 2012 London Summer Games, involves seven curators from the Literature and Art department, who select sites based on eight themes. Two crawls were planned, before and after the games, covering a list of 558 seeds. On social media, we focused only on Twitter, with 447 French accounts or hashtags collected twice a day from the 4th to the 24th of August. These crawls will be complemented by one for the Paralympic Games, to be launched on the 18th of September. We have also communicated our list of seeds for the worldwide collaborative collection led by the British Library for IIPC.