First, the Winter Olympic Games harvest ended at the end of February, and a new crawl dedicated to the Winter Paralympics was launched last week. About 14 million URLs were collected in February, including almost 1.4 million Twitter URLs, for a total of 0.57 TB.
We also decided to make another attempt to collect Instagram in 2022, and after several tries we succeeded. We collected 73 Instagram accounts on the theme of the Olympics, amounting to about 7,000 URLs.
Finally, last December we opened a participation form, available until the end of January, so that the public could suggest sites to be added to the Artificial Intelligence harvest.
Next month we will run our annual broad crawl of open-access e-serials, for the third year in a row. We obtain the URL list from our catalogue, where all the serials that request an ISSN are catalogued. Every year we improve the list of URLs thanks to the quality control we carry out after each crawl. We will harvest about 9,000 websites this year.
The Library is renovating its technological infrastructure, so we are studying when we will be able to carry out the annual broad crawl. It will probably take place before summer. The renovation involves the acquisition of new servers and should solve some problems we had last year with the storage arrays. Once the new infrastructure has stabilized, our next step is the NAS upgrade, which we think we can tackle next May. We also want to start indexing the current crawls in SolrWayback; we will address the indexing of everything harvested prior to 2022 later.