Child pages
  • 2017-03-07 Statusmeeting

...

Status of the production sites

Netarkivet

NAS 5.2.2

NAS 5.2.2 is now running on our production system. Metadata will be compressed. We will not run deduplication until the whole archive has been compressed.

Social Media

The analysis of how to crawl selected social media is ongoing. We are looking at Facebook, Twitter, YouTube, Vimeo, Instagram, SoundCloud, Reddit, Flickr, Vine, Pinterest and LinkedIn. We have already decided not to collect Snapchat, Google+ or Bandcamp.

BCWeb

We are going to test BCWeb to find out whether we can use it to get help from external curators.

Access

We have upgraded our Citrix access software and solved the problems with user categories.

Dialog with web hotels that block our harvester

We have started a dialog with the web hotels (web hosting providers) that are blocking our harvester, in order to find a solution that will make them lift the block.

Some statistics

Data amount (as of 5 March 2017)

Total GB/TB (base 1024) in the archive: 793544/774

Of which:

Broad crawls and ultra-big sites (GB/TB): 634638/619 and 62346/60

Selective crawls (GB/TB): 97198/94

Event crawls (GB/TB): 35387/34

(Excluding metadata files and test crawl files)
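
The TB figures above are consistent with dividing each GB figure by 1024 and rounding down. The following is a minimal sketch of that conversion, not the script actually used to produce the statistics. The function name `gb_to_tb` and the rounding-down rule are assumptions inferred from the reported numbers, as is the reading that the two "broad crawls and ultra-big sites" values correspond to those two categories in order.

```python
# Assumption: 1 TB = 1024 GB, and the TB value is rounded down.
# Both are inferred from the figures in the notes, not stated there.
GB_PER_TB = 1024


def gb_to_tb(gigabytes: int) -> int:
    """Convert whole gigabytes to whole terabytes (base 1024), rounding down."""
    return gigabytes // GB_PER_TB


# GB figures as reported in the notes.
reported_gb = {
    "Total": 793544,
    "Broad crawls": 634638,
    "Ultra-big sites": 62346,
    "Selective crawls": 97198,
    "Event crawls": 35387,
}

for label, gb in reported_gb.items():
    print(f"{label}: {gb} GB = {gb_to_tb(gb)} TB")
```

Running this reproduces the TB values listed above (774, 619, 60, 94 and 34).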

BnF


ONB


BNE


...