
The 2017 NAS workshop will take place on April 26-28 and will be hosted by the Austrian National Library in Vienna.

Location:

Address: Austrian National Library / Training Department, Augasse 2-6, 1090 Vienna. Attention: this is not on the historic library premises in the city center!

Map: https://www.onb.ac.at/bibliothek/ausbildung/neuer-standort/

Public transport: metro lines U4 and U6, tram D

Route planner: http://www.wienerlinien.at/eportal3/ep/channelView.do/pageTypeId/66533/channelId/-48703

 

Hotels nearby:

Arthotel ANA Katharina, http://ana-hotels.de/katharina (2 min walk)

ibis Styles Wien City, http://www.accorhotels.com/de/hotel-9034-ibis-styles-wien-city/index.shtml (5 min walk)

Hotel Boltzmann, http://www.hotelboltzmann.at/index.php (15-20 min walk)


Participants:

Organization (technical and curator participants)

  • Netarkivet: Colin, Tue, Stephen, Søren, Sabine
  • ONB: Michaela, Andreas
  • BnF: Sara, Thomas, Géraldine
  • BNE: Juan Carlos, Fernando, Mar

Topics to be discussed:

NAS5 / Heritrix 3 - technical

  • State of the art of current developments
  • Upcoming developments
  • Introduce a multiple crawlers approach into NAS
  • Videos/social media harvesting
  • What CDX format are you using today, and which do you plan to support within the next year?
  • Which version of (Open)Wayback are you using today, and what do you think about the future development of OpenWayback?
  • Which social media can you archive today?
  • How to consolidate crawl.log and frontier search features in NetarchiveSuite?
  • BnF's freetext search (better than KB DK's) - anything to share with the community?
  • Others?

Heritrix 3 - curatorial

  • Feedback on using NAS 5 and Heritrix 3
  • Missing features
  • Priorities for future developments
  • Is it possible to connect tools other than Heritrix to NAS (tools that can produce WARC files and capture content which Heritrix is not able to catch)? If so, which tools do we want to use?
  • Revival and update of the curator roadmap
  • Harvest the electoral web: selection, harvest parameters
  • Experiences with harvesting pages with login content (pay walls)
  • Experiences with harvesting images embedded in JavaScript (and replaying them in the archive)
  • Exchange of experiences with documentation of the crawls (in and outside NAS)
  • Others?

DRAFT agenda

Schedule for Day 1 (12:30-17:30)

12:30 - 14:00 Welcome, sandwiches and coffee

14:00 - 14:30 Workshop introduction (Michaela, Sara)

14:30 - 16:00 Institution updates and plans for 2017

  • Update from ONB (Michaela)
  • Update from BNE (Mar, Juan Carlos)
  • Update from BnF (Géraldine)
  • Update from KB (Sabine, Tue)

16:00 - 16:15 Coffee break

16:15 - 17:30 NetarchiveSuite 5.3: demo and discussion of latest features (Colin, Sara)

19:30 Dinner

Schedule for Day 2 (9:00-17:00)

09:00 - 11:30

Technical track:

  • Discuss technical challenges in NAS 5 and Heritrix 3 installation
  • Discuss NAS development cycles and how to best contribute to the code
  • Discuss future developments regarding H3 integration and others
  • Sum up

Curator track:

  • Share experiences with crawl documentation in and outside NAS
  • Share experiences in using NAS 5 and Heritrix 3
  • Establish a list of NAS missing features
  • Sum up, define priorities

10:30 - 10:45 Coffee break

11:30 - 12:30 Common track: presentation of curators' priorities, update of NAS curator roadmap: https://sbforge.org/jira/projects/NASC/summary/statistics

12:30 - 14:00 Lunch

14:00 - 15:30

Common track: Harvesting videos and social media:

  • State of the art and considerations in the different institutions
  • Discuss possible external tool integration

15:30 - 15:45 Coffee break

15:45 - 17:00

Technical track:

  • Further technical considerations
  • Discuss or make a proof of concept

Curator track:

  • Share experiences with harvesting content beyond pay walls (Andreas, Géraldine, others?)

19:30 Dinner (optional)

Schedule for Day 3 (9:00-14:00)

09:00 - 10:30

Technical track:

  • PoC to follow
  • Discuss access tools (CDX, Wayback, full-text search)

Curator track:

  • Collection strategies
  • BCweb: demo, use at BNE and BnF, further collaborations

10:30 - 10:45 Coffee break

10:45 - 12:30 Tracks sum-up, community next steps

12:30 - 14:00 Lunch and goodbye

 

 
