  1. NetarchiveSuite
  2. NAS-2111 Add WARC capability to the wayback package
  3. NAS-2070

WARC-enable dk.netarkivet.wayback.NetarchiveResourceStore


Details

    • Sub-task
    • Resolution: Fixed
    • Major
    • I53, 4.0
    • None
    • Wayback
    • None

      Using NAS with WARC writing enabled, run an event harvest combining two sites, www.netarkivet.dk and www.pligtaflevering.dk.

      Install a standalone Wayback in proxy mode, using NetarchiveResourceStore as the ResourceStore and pointing to a settings file that uses LocalArcRepository.

      Generate the CDX using the Wayback tool 'warc-indexer'.

      To be completed!


    Description

      We need to change dk.netarkivet.wayback.NetarchiveResourceStore so that it handles WARC records, not just ARC records, fetched from the bitarchive.
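A minimal sketch of the kind of format dispatch this change implies, assuming the archive file name is enough to tell ARC from WARC. The class `ArchiveFormatSketch`, its `Format` enum, and the `detect` method are hypothetical names used for illustration only; they are not part of NetarchiveSuite or Wayback.

```java
import java.util.Locale;

// Hypothetical helper, not part of NetarchiveSuite or Wayback.
class ArchiveFormatSketch {
    enum Format { ARC, WARC, UNKNOWN }

    /** Guesses the record format from the file name (assumption: the suffix is reliable). */
    static Format detect(String fileName) {
        String name = fileName.toLowerCase(Locale.ROOT);
        // Strip an optional gzip suffix before looking at the format suffix.
        if (name.endsWith(".gz")) {
            name = name.substring(0, name.length() - 3);
        }
        if (name.endsWith(".warc")) {
            return Format.WARC;
        }
        if (name.endsWith(".arc")) {
            return Format.ARC;
        }
        return Format.UNKNOWN;
    }

    public static void main(String[] args) {
        // The file names below are made up for the example.
        System.out.println(detect("example-harvest.warc.gz"));
        System.out.println(detect("example-harvest.arc"));
        System.out.println(detect("notes.txt"));
    }
}
```

A resource store could branch on such a result to decide whether to build an ARC or a WARC resource from the fetched record; how the real class decides is not stated in this issue.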

      WARC data can already be accessed from Wayback through the NetarchiveCacheResourceStore, but that class seems too inefficient for large-scale use because it fetches whole archive files from the bitarchive.
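To illustrate why fetching a single record is cheaper than fetching a whole archive file, here is a rough sketch that reads only a byte range at a known offset. It uses a local temporary file as a stand-in for the bitarchive, and `RecordRangeSketch`, `readRange`, and `demo` are hypothetical names; the actual bitarchive access in NetarchiveSuite works differently and is not shown here.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration; a local file stands in for the bitarchive.
class RecordRangeSketch {
    /** Reads exactly `length` bytes starting at `offset`, without loading the rest of the file. */
    static byte[] readRange(Path file, long offset, int length) {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            byte[] buf = new byte[length];
            raf.seek(offset);
            raf.readFully(buf); // throws EOFException if the file is too short
            return buf;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes ten bytes to a temporary file and reads four of them back from offset 3. */
    static String demo() {
        try {
            Path tmp = Files.createTempFile("record-range-sketch", ".bin");
            try {
                Files.write(tmp, "0123456789".getBytes(StandardCharsets.US_ASCII));
                return new String(readRange(tmp, 3, 4), StandardCharsets.US_ASCII);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 3456
    }
}
```

With a CDX index supplying the file name and offset of each record, a range read like this touches only the bytes of one record, whereas a cache-based store has to transfer the entire archive file first.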

            People

              nicl@kb.dk Nicholas Clarke (Inactive)
              svc Søren Vejrup Carlsen (Inactive)