|Our first broad crawl with NAS5 and H3 is finished! We crawled 101.55 TB in 6 weeks. We encountered 4 problems during this crawl:|
The access we enabled for users last midsummer is available only from the BNE and the regional libraries that requested it. Although we have publicized this new service, we have had few consultations so far, because access is on-site only and the interface (the default OpenWayback) is not very user-friendly. We give open access (on the internet) to the captures we have of a specific website
Last month, we successfully migrated all our web collections to the production environment of NAS 5. We are reasonably happy with the new environment.
However, despite the tests we ran in the preproduction environment, we experienced some problems, mainly related to the configuration of templates in NAS 5.
Frontpage+1 and frontpage+2 didn’t work as expected. We also noticed that some of the crawls ran very fast, but they stopped as soon as they encountered any slight problem and didn’t manage to finish.
Juan Carlos compared the NAS 5 templates with the ones in NAS 4 and adjusted some parameters. Everything now appears to be working properly: crawls finish faster than before and harvest more objects. However, the default template is not working yet, and my IT colleagues are studying its configuration.
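A comparison of this kind could be partly automated. The following is only a minimal sketch, not the procedure actually used: it assumes both templates are XML files and flattens each into a map of setting paths to values so the differences can be listed. The element names in the sample strings (`order`, `max-retries`, `delay-factor`, `max-hops`) are hypothetical and are not taken from real NAS templates.

```python
import xml.etree.ElementTree as ET


def flatten(xml_text):
    """Flatten an XML document into a {path: value} dictionary.

    Limitation of this sketch: sibling elements that share the same tag
    and name overwrite one another.
    """
    settings = {}

    def walk(elem, path):
        # Drop any XML namespace prefix such as "{http://...}bean".
        tag = elem.tag.split("}")[-1]
        name = elem.get("name")
        here = f"{path}/{tag}" + (f"[{name}]" if name else "")
        # Record every attribute except "name" as its own setting.
        for key, value in elem.attrib.items():
            if key != "name":
                settings[f"{here}@{key}"] = value
        text = (elem.text or "").strip()
        if text:
            settings[here] = text
        for child in elem:
            walk(child, here)

    walk(ET.fromstring(xml_text), "")
    return settings


def diff_settings(old, new):
    """Return (only_in_old, only_in_new, changed) for two flattened maps."""
    only_old = {k: old[k] for k in old.keys() - new.keys()}
    only_new = {k: new[k] for k in new.keys() - old.keys()}
    changed = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return only_old, only_new, changed


# Hypothetical sample templates, for illustration only.
OLD_TEMPLATE = (
    "<order><max-retries>30</max-retries>"
    "<delay-factor>5.0</delay-factor></order>"
)
NEW_TEMPLATE = (
    "<order><max-retries>3</max-retries>"
    "<max-hops>20</max-hops></order>"
)

only_old, only_new, changed = diff_settings(
    flatten(OLD_TEMPLATE), flatten(NEW_TEMPLATE)
)
```

With real templates, the two strings would be read from the template files instead. Settings that appear in only one file, or whose values differ, are the natural candidates to review first.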
We are waiting for the system to become more stable before running the .gal domain crawl. We hope to launch it before the end of the year.
The Library is mirroring its storage at another location of the Ministry of Education and Culture, so we will have a copy of our web archive there within the next few months.