We are still working on our move to NAS 5 and H3 (see above).
Our annual broad crawl was completed on December 5th, after 8 weeks. We gathered 90.4 TB of data (compressed), which makes this the largest crawl ever carried out at the BnF. The infrastructure was stable and we didn't encounter any technical problems. We will analyse the data in more detail, focusing on two subjects: regional domains and ebooks.
|We now have a president, so we will keep running our presidential elections crawl until the president takes office at the end of January.|
We created a new collection “women/gender” in cooperation with ONB’s women/gender-documentation department. The crawl was done in December.
Our next broad crawl will take place in 2017; we still run broad crawls every two years. We will work on a new concept for the web archive, covering harvesting intervals and related questions.
In Production we had to switch to NAS 5.2.2 due to a bug in the DeduplicateToCDXAdapter (