...

- Improvement to the HTML parser https://sbforge.org/jira/browse/NAS-2891

NAS problems with non-fetched image resolutions? (srcset data, responsive pages,

Discussion about topics for a future NAS workshop

...

The digital legal deposit service welcomes a new colleague, Florence Simonet, as digital collections manager. She will work mainly on the harvests.

The end of the summer is marked by two important projects to save sites before they close.
Skyblog, which was the largest French blogging platform in the 2000s, closed to the public on August 21st. The BnF harvest began last week and covers more than 12.6 million blogs for a total of 1261 jobs.
The harvest is expected to last about 2 months and the estimated size is about 40 TB.

Furthermore, the Orange personal pages hosting service will close on September 5th. This is a website creation service linked to the telephone operator Orange. Harvesting tests will begin soon and should cover around 450 000 sites for about 12 TB of data.

As every year, we are currently preparing our next broad crawl, which will be launched in October 2023.

ONB

BNE

We have created a new collection about the comics industry. We want to collect all creators and authors who publish their work on the Internet; most of them publish only on the Internet.

In July, we launched a new search engine that allows users to find the websites collected in the different selective collections by title or keywords: https://www.bne.es/es/colecciones/archivo-web-espanola/buscador

We would like to ask a question about NAS: is it possible to get the number of jobs for a group of harvests? Sometimes we need to extract the number of jobs for a whole collection spread across different harvests, and we don't know how to do it.

We are working on upgrading NAS to version 7.4 and need help fixing a problem. When storing WARC files we receive a checksum error. We have consulted the documentation on replicas in archiving, but have not found where the problem is. In our installation we do not use replicas.


The error looks like:


Host: HNAS011.bne.local

Date: Tue Jul 11 13:49:30 CEST 2023

dk.netarkivet.archive.arcrepository.distribute.JMSArcRepositoryClient.store(JMSArcRepositoryClient.java:277)

Could not store 'harvester_high/99340_1689058904228/metadata/99340-metadata-1.warc' after 3 attempts. Giving up.

The returned message 'ID:23688883-192.168.81.2(d8:2a:24:86:bd:65)-37144-1689076166675: To BNE_COMMON_THE_REPOS ReplyTo BNE_COMMON_THIS_REPOS_CLIENT_192_168_81_2_HCS_HIGH_H3_011 Error: Failure while trying to store ARC file: 99340-metadata-1.warc Arcfile: 99340-metadata-1.warc, precomputed checksum: 99cb38e2bdade211ff7ab596bbe051df' was not ok while waiting for reply on store of file 'harvester_high/99340_1689058904228/metadata/99340-metadata-1.warc' on attempt number 1 of 3. Error message was 'Failure while trying to store ARC file: 99340-metadata-1.warc'

The returned message 'ID:23688998-192.168.81.2(d8:2a:24:86:bd:65)-37144-1689076169036: To BNE_COMMON_THE_REPOS ReplyTo BNE_COMMON_THIS_REPOS_CLIENT_192_168_81_2_HCS_HIGH_H3_011 Error: Failure while trying to store ARC file: 99340-metadata-1.warc Arcfile: 99340-metadata-1.warc, precomputed checksum: 99cb38e2bdade211ff7ab596bbe051df' was not ok while waiting for reply on store of file 'harvester_high/99340_1689058904228/metadata/99340-metadata-1.warc' on attempt number 2 of 3. Error message was 'Failure while trying to store ARC file: 99340-metadata-1.warc'

The returned message 'ID:23689055-192.168.81.2(d8:2a:24:86:bd:65)-37144-1689076170195: To BNE_COMMON_THE_REPOS ReplyTo BNE_COMMON_THIS_REPOS_CLIENT_192_168_81_2_HCS_HIGH_H3_011 Error: Failure while trying to store ARC file: 99340-metadata-1.warc Arcfile: 99340-metadata-1.warc, precomputed checksum: 99cb38e2bdade211ff7ab596bbe051df' was not ok while waiting for reply on store of file 'harvester_high/99340_1689058904228/metadata/99340-metadata-1.warc' on attempt number 3 of 3. Error message was 'Failure while trying to store ARC file: 99340-metadata-1.warc'
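One way to narrow this down might be to recompute the checksum of the file on the harvester and on the storage side, then compare both with the precomputed value in the message. The sketch below is only an illustration: it assumes the precomputed checksum is an MD5 digest (the value in the log is 32 hexadecimal characters, which is consistent with MD5 but is not confirmed here), and the helper names md5_of_file and checksum_matches are hypothetical, not part of NAS.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_hex):
    """Compare a file's MD5 digest with an expected hex string (assumed MD5), ignoring case."""
    return md5_of_file(path) == expected_hex.strip().lower()


# Example call (the local path is an assumption; the checksum is the one from the log):
# checksum_matches("99340-metadata-1.warc", "99cb38e2bdade211ff7ab596bbe051df")
```

If the digest computed on the harvester matches the precomputed value but the copy on the storage side does not, the file was probably altered or truncated in transfer; if both match, the failure more likely lies in how the repository validates the upload.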

KB-Sweden


Next meetings

...