  1. NetarchiveSuite
  2. NAS-2690

The function "Browse only relevant crawl-log lines for this domain" is faulty


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: 5.4
    • Affects Version/s: 5.2.2, 5.3.1
    • Component/s: GUI
    • Labels: None

      How to test this:
      Construct a tool to fetch relevant crawl-log lines from a local metadata WARC file
      (args: domain, metadata WARC file).
      Fetch a metadata WARC file with a large crawl log.

      Test that the new domain-specific regexp returns the correct lines.
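      A minimal sketch of such a test tool, under stated assumptions: the class name CrawlLogGrep and its methods are hypothetical and not part of NetarchiveSuite. It scans every line of the file as text rather than parsing WARC records, so it looks at the whole metadata file and not only the crawl-log record. The domainRegexp method reproduces the expression from QA-searchcrawllog.jsp and is the place to plug in the new regexp under test.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.zip.GZIPInputStream;

// Hypothetical test tool: prints the lines of a local metadata WARC file
// that match the domain regexp. Not part of NetarchiveSuite.
public class CrawlLogGrep {

    // Regexp under test. This body reproduces the expression from
    // QA-searchcrawllog.jsp; replace it with the new domain-specific regexp.
    static String domainRegexp(String domain) {
        return ".*" + domain.replaceAll("\\.", "\\\\.") + ".*";
    }

    // Returns every line from the reader that the pattern matches as a whole.
    static List<String> matchingLines(BufferedReader reader, Pattern pattern) {
        return reader.lines()
                .filter(line -> pattern.matcher(line).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: CrawlLogGrep <domain> <metadata-warc-file>");
            return;
        }
        Pattern pattern = Pattern.compile(domainRegexp(args[0]));
        InputStream in = new FileInputStream(args[1]);
        // Assumption: a compressed metadata file carries a .gz suffix.
        if (args[1].endsWith(".gz")) {
            in = new GZIPInputStream(in);
        }
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            for (String hit : matchingLines(reader, pattern)) {
                System.out.println(hit);
            }
        }
    }
}
```

      Comparing the output for a known domain against a manual inspection of the same crawl log shows whether the regexp returns too many or too few lines.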


    Description

      The code in harvester/qa-gui/src/main/webapp/QA-searchcrawllog.jsp is faulty:

      if (regexp != null && regexp.length() != 0) {
          crawlLogExtract = Reporting.getCrawlLoglinesMatchingRegexp(jobid, regexp);
      } else { // use 'domain' as the regular expression
          regexp = ".*" + domain.replaceAll("\\.", "\\\\.") + ".*";
          crawlLogExtract = Reporting.getCrawlLoglinesMatchingRegexp(jobid, regexp);
      }
      

      The regexp built in the else branch is the one used by the "Browse only relevant crawl-log lines for this domain" function.
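      The ticket does not spell out the fault, but one property of the else-branch pattern is easy to demonstrate: it is unanchored, so it matches any line that contains the domain string anywhere, including in the referrer field or as the tail of a longer host name. The sketch below contrasts it with a stricter pattern. The stricter pattern is an illustration only, not the fix that went into 5.4. It assumes the Heritrix crawl-log layout, in which the fetched URI is the fourth whitespace-separated field; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical comparison of the JSP's pattern with a host-anchored one.
public class DomainCrawlLogRegexp {

    // The pattern the JSP builds today: the domain may appear anywhere in the line.
    static String looseRegexp(String domain) {
        return ".*" + domain.replaceAll("\\.", "\\\\.") + ".*";
    }

    // Illustrative stricter pattern (an assumption, not the shipped fix):
    // the domain must be the host, or a parent of the host, of the fetched URI.
    static String strictRegexp(String domain) {
        return "^\\S+\\s+\\S+\\s+\\S+\\s+"          // timestamp, status code, size
             + "(?:dns:|[a-z][a-z0-9+.-]*://)"      // dns: lookups or scheme://
             + "(?:[^/\\s@]*@)?"                    // optional user info
             + "(?:[^/\\s:@]+\\.)?"                 // optional subdomain labels
             + Pattern.quote(domain)                // the domain, taken literally
             + "(?::\\d+)?"                         // optional port
             + "(?:[/?#]\\S*)?\\s.*$";              // optional path, then remaining fields
    }

    public static void main(String[] args) {
        String domain = "example.com";
        String[] lines = {
            // URI on the domain itself
            "2016-01-01T00:00:00.000Z   200   1234 http://www.example.com/index.html L http://example.com/ text/html #001 20160101000000000+12 sha1:ABC - -",
            // domain only in the referrer field
            "2016-01-01T00:00:01.000Z   200    512 http://other.org/img.png E http://example.com/ image/png #002 20160101000001000+8 sha1:DEF - -",
            // domain only as the tail of another host name
            "2016-01-01T00:00:02.000Z   200     99 http://notexample.com/ - - text/html #003 20160101000002000+5 sha1:GHI - -"
        };
        for (String line : lines) {
            System.out.println("loose=" + line.matches(looseRegexp(domain))
                    + " strict=" + line.matches(strictRegexp(domain))
                    + " : " + line);
        }
    }
}
```

      Whether referrer-only lines should count as relevant for a domain is a design decision; the sketch only makes the difference between the two patterns visible.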


          People

            Assignee: Søren Vejrup Carlsen (Inactive)
            Reporter: Søren Vejrup Carlsen (Inactive)