After the dreadful attacks that took place in Paris on the 7th and 9th of January and the events that followed, we decided to launch an emergency crawl to harvest web resources (news articles, blog posts, social media reactions, institutional websites…) related or reacting to them. We appealed to IIPC members and to our BnF network of librarians, asking them to help us quickly gather references so that we could build the most complete and relevant seed list possible. Given the exceptional nature of the event, the scope and selection criteria were extended to an international scale, with the aim of covering the different forms and the diversity of the reactions. We received 2,480 URLs from eighteen different IIPC members and 1,740 URLs nominated by more than 70 BnF librarians. In addition to these selections, our existing seed lists of French governmental, news, political, and activist websites were specially harvested. Finally, our regular daily and weekly harvests of the principal French news sites, which were particularly relevant during those days, ran as usual.
Technically, the crawls were performed from 8th to 16th January 2015, and each website was crawled at least once to a depth of the nominated page plus one click. During the same period, selected Twitter accounts and popular hashtags (such as the now-famous #JesuisCharlie) were crawled four times a day. In all, 15.9 million URLs were collected, amounting to 0.5 TB of data.
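To illustrate what a "page + 1 click" depth means, here is a minimal Python sketch. It is not the crawler configuration actually used for these harvests; the function name `page_plus_one_click`, the link graph, and the example URLs are all hypothetical and serve only to show the scoping rule: each nominated page is collected, along with every page it links to directly, and nothing further.

```python
from typing import Dict, Iterable, List, Set


def page_plus_one_click(seeds: Iterable[str], outlinks: Dict[str, List[str]]) -> Set[str]:
    """Return each seed plus every URL linked directly from a seed.

    `outlinks` maps a page URL to the URLs it links to (a hypothetical,
    pre-extracted link graph; a real crawler discovers these while fetching).
    """
    in_scope: Set[str] = set()
    for seed in seeds:
        in_scope.add(seed)                       # depth 0: the nominated page itself
        in_scope.update(outlinks.get(seed, []))  # depth 1: pages one click away
    return in_scope


# Hypothetical link graph, for illustration only.
EXAMPLE_LINKS = {
    "https://news.example/article": [
        "https://news.example/related",
        "https://blog.example/reaction",
    ],
    "https://news.example/related": ["https://news.example/archive"],
}

scope = page_plus_one_click(["https://news.example/article"], EXAMPLE_LINKS)
```

In this sketch, `https://news.example/archive` stays out of scope because it is two clicks away from the seed, which is how a shallow depth keeps an emergency crawl focused on the nominated pages and their immediate context.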
Any other business?