NetarchiveSuite / NAS-2703

Patch H3 to support srcset attribute

Details

    • Type: New Feature
    • Resolution: Fixed
    • Priority: Major
    • Version: 5.5
    • Component: Heritrix 3

    Description

      There is an Internet Archive (IA) patch that enables Heritrix to harvest URLs from the srcset attribute: https://github.com/internetarchive/heritrix3/pull/179/files

      The lack of srcset support is a major issue for netarkivet.dk, affecting some of the largest and most important sites: https://sbprojects.statsbiblioteket.dk/pages/viewpage.action?pageId=31491867

      (See also Netarkivet issue NARK-1554: https://sbprojects.statsbiblioteket.dk/jira/browse/NARK-1554)

      The task consists of four steps: patch our H3 with the relevant IA commit, test that the patch makes a difference to srcset harvesting, deploy the patched H3 to Nexus, and finally test that we can build and deploy NAS with the patched H3.

      Strategies for patching:

      1. Create a patch file from the pull request on GitHub and apply it to our code, or
      2. Add the IA Heritrix repository as an upstream remote in a local git checkout of our H3, then cherry-pick the relevant commit.
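
      For the testing step of the task above, it may help to be explicit about what a srcset value contains: a comma-separated list of image candidates, each a URL optionally followed by a width descriptor (such as 480w) or a pixel-density descriptor (such as 2x). A harvester that supports srcset has to queue every candidate URL, not only the one in src. The following is a minimal sketch of that extraction, loosely following the HTML parsing rules. It is not the code from the IA pull request and uses no Heritrix API; the class name SrcsetSketch and the method extractUrls are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Heritrix or the IA patch.
public class SrcsetSketch {

    /** Returns the candidate URLs found in a srcset attribute value, in order. */
    public static List<String> extractUrls(String srcset) {
        List<String> urls = new ArrayList<>();
        int i = 0;
        int n = srcset.length();
        while (i < n) {
            // Skip whitespace and commas that separate candidates.
            while (i < n && (Character.isWhitespace(srcset.charAt(i))
                    || srcset.charAt(i) == ',')) {
                i++;
            }
            if (i >= n) {
                break;
            }
            // The URL is the next run of non-whitespace characters,
            // so commas inside a URL are preserved.
            int start = i;
            while (i < n && !Character.isWhitespace(srcset.charAt(i))) {
                i++;
            }
            String url = srcset.substring(start, i);
            if (url.endsWith(",")) {
                // No descriptor follows: drop the trailing comma(s).
                url = url.replaceAll(",+$", "");
            } else {
                // Skip the descriptor (e.g. "480w" or "2x") up to the next comma.
                while (i < n && srcset.charAt(i) != ',') {
                    i++;
                }
            }
            if (!url.isEmpty()) {
                urls.add(url);
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        String srcset = "small.jpg 480w, medium.jpg 800w, large.jpg 2x";
        for (String url : extractUrls(srcset)) {
            System.out.println(url);
        }
    }
}
```

      A test page for the patched H3 could then contain an img element whose srcset lists several such candidates, and the test would check that each of them shows up in the crawl log. Relative candidate URLs would still need to be resolved against the page URL, which this sketch leaves out.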

            People

              jrg Jeppe Ravn-Grove (Inactive)
              csr Colin Rosenthal