The 2017 NAS workshop will take place on April 26-28 and will be hosted by the Austrian National Library in Vienna.

Location:

Address: Austrian National Library / Training Department, Augasse 2-6, 1090 Vienna. Note: this is not on the historic library premises in the city center!

...

Organization

Technical

Curator 

Netarkivet

  

ONB

 

 

BnF

Sara

Thomas or Bert

Géraldine

BNE

2

1
KB Sweden?  

Topics to be discussed:

NAS5 / Heritrix 3 - technical

  • State of the art of current developments
  • Upcoming developments
  • Introducing a multi-crawler approach into NAS
  • Videos/social media harvesting
  • What CDX format are you using today, and which do you plan to support within the next year?
  • Which version of (Open)Wayback are you using today, and what do you think about the future development of OpenWayback?
  • Which social media can you archive today?
  • How to consolidate crawl.log and frontier search features in NetarchiveSuite?
  • BnF's free-text search (better than KB DK's) - anything to share with the community?
  • Others?

Heritrix 3 - curatorial

  • Feedback on using NAS 5 and Heritrix 3
  • Missing features
  • Priorities for future developments
  • Is it possible to connect tools other than Heritrix to NAS (tools that can produce WARC files and capture content that Heritrix cannot)? If so, which tools do we want to use?
  • Revival and update of the curator roadmap
  • Harvesting the electoral web: selection, harvest parameters
  • Experiences with harvesting pages with login-protected content (paywalls)
  • Experiences with harvesting images embedded in JavaScript (and replaying them in the archive)
  • Exchange of experiences with documentation of the crawls (in and outside NAS)
  • Others?

...

DRAFT agenda

Schedule for Day 1 (12:30-17:30)

12:30 - 14:00 Welcome, sandwiches and coffee

14:00 - 14:30 Workshop introduction (Michaela Mayr, Sara Aubry)

Location: 

14:30 - 16:00 Institution updates and plans for 2017

  • Update from ONB (Michaela Mayr)
  • Update from BNE (Mar Pérez Morillo)
  • Update from BnF (Géraldine Camile, Sara Aubry)
  • Update from KB (Sabine Schostag, Tue Hejlskov Larsen)

16:00 - 16:15 Coffee break

16:15 - 17:30 NetarchiveSuite 5.3: demo and discussion of latest features

19:30 Dinner

Schedule for Day 2 (9:00-17:00)

Location: 

09:00 - 12:30

Technical track:

  • Discuss technical challenges in NAS 5 and Heritrix 3 installation
  • Discuss NAS development cycles and how to best contribute to the code
  • Discuss future developments regarding H3 integration and others
  • Sum up

Curator track:

  • Share experiences in using NAS 5 and Heritrix 3
  • Establish a list of missing NAS features
  • Share experiences with crawl documentation (in and outside NAS)
  • Update NAS curator roadmap, define priorities

10:30 - 10:45 Coffee break

12:30 - 14:00 Lunch

14:00 - 15:30

Common track: Harvesting videos and social media:

  • State of the art and considerations in the different institutions
  • Discuss possible external tool integration

15:30 - 15:45 Coffee break

15:45 - 17:00

Technical track:

  • Further technical considerations
  • Discuss or make a proof of concept

Curator track:

  • Share experiences with harvesting content behind paywalls

19:30 Dinner (optional)

Schedule for Day 3 (9:00-14:00)

Location: 

09:00 - 10:30

Technical track:

  • Continue the proof of concept (PoC)
  • And/or discuss access tools (CDX, Wayback, full-text search)

Curator track:

  • BCweb: demo, use at BNE and BnF, further collaborations

10:30 - 10:45 Coffee break

10:45 - 12:30 Tracks sum-up, community next steps

12:30 - 14:00 Lunch and goodbye