  1. NetarchiveSuite
  2. NAS-2690

The function "Browse only relevant crawl-log lines for this domain" is faulty

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 5.2.2, 5.3.1
    • Fix Version/s: 5.4
    • Component/s: GUI
    • Labels:
      None
    • Verification:
      How to test this:
      Construct a tool to fetch the relevant crawl-log lines from a local metadata WARC file
      (args: domain, metadata WARC file).
      Fetch a metadata WARC file with a large crawl log.

      Test that the new domain-specific regexp returns the correct lines.
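
The test tool described above could be sketched roughly as follows. This is a simplified illustration, not the tool used for the actual verification: it assumes the crawl log has already been extracted from the metadata WARC file into a plain-text file, and the class name `CrawlLogDomainFilter` and its methods are hypothetical. The `domainRegexp` method is a placeholder for whichever regexp is under test; here it reproduces the construction from QA-searchcrawllog.jsp.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: prints the crawl-log lines that match a domain regexp.
// Assumption: the crawl log was extracted from the metadata WARC file beforehand.
public class CrawlLogDomainFilter {

    // Placeholder for the regexp under test; this is the construction
    // currently found in QA-searchcrawllog.jsp.
    static String domainRegexp(String domain) {
        return ".*" + domain.replaceAll("\\.", "\\\\.") + ".*";
    }

    // Returns every line that the regexp matches in full.
    static List<String> matchingLines(Stream<String> lines, String regexp) {
        Pattern pattern = Pattern.compile(regexp);
        return lines
                .filter(line -> pattern.matcher(line).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: CrawlLogDomainFilter <domain> <crawl-log-file>");
            System.exit(1);
        }
        // Files.lines reads the file lazily as UTF-8, which helps with large crawl logs.
        try (Stream<String> lines = Files.lines(Paths.get(args[1]))) {
            matchingLines(lines, domainRegexp(args[0])).forEach(System.out::println);
        }
    }
}
```

Running it against a large crawl log and inspecting the output by hand gives a quick check of whether the regexp selects only the lines that belong to the domain.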

      Description

      The code in harvester/qa-gui/src/main/webapp/QA-searchcrawllog.jsp is faulty:

      if (regexp != null && regexp.length() != 0) {
          crawlLogExtract = Reporting.getCrawlLoglinesMatchingRegexp(jobid, regexp);
      } else { // use 'domain' as the regular expression
          regexp = ".*" + domain.replaceAll("\\.", "\\\\.") + ".*";
          crawlLogExtract = Reporting.getCrawlLoglinesMatchingRegexp(jobid, regexp);
      }
      

      The regexp built in the else branch is the one used by the "Browse only relevant crawl-log lines for this domain" function.
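
The issue does not spell out what is wrong with the regexp. One plausible problem, shown in the sketch below, is that `.*domain.*` matches any line that contains the domain anywhere, including lines for other hosts whose name merely ends in the domain and lines where the domain appears only as the referrer. The sample crawl-log lines are made up, and `hostAnchoredRegexp` is a hypothetical stricter alternative that assumes the fetched URI is the fourth whitespace-separated field; it is not necessarily the fix that went into 5.4.

```java
import java.util.regex.Pattern;

public class DomainRegexpDemo {

    // Made-up sample lines: timestamp, status, size, URI, discovery path, referrer, MIME type.
    static final String OWN_LINE =
            "2016-01-01T12:00:00.000Z 200 1234 http://www.example.dk/index.html L http://www.example.dk/ text/html";
    static final String OTHER_HOST_LINE =
            "2016-01-01T12:00:01.000Z 200 4321 http://www.notexample.dk/ L http://other.dk/ text/html";
    static final String REFERRER_ONLY_LINE =
            "2016-01-01T12:00:02.000Z 200 999 http://other.dk/page L http://example.dk/ text/html";

    // Same construction as the else branch in QA-searchcrawllog.jsp.
    static String currentRegexp(String domain) {
        return ".*" + domain.replaceAll("\\.", "\\\\.") + ".*";
    }

    // Hypothetical alternative: require the domain (or a subdomain of it)
    // to be the host of the URI in the fourth field.
    static String hostAnchoredRegexp(String domain) {
        return "^\\S+\\s+\\S+\\s+\\S+\\s+(?:[a-z]+://|dns:)([^/\\s:]+\\.)?"
                + Pattern.quote(domain) + "([:/]\\S*)?\\s.*";
    }

    static boolean matches(String regexp, String line) {
        return Pattern.compile(regexp).matcher(line).matches();
    }

    public static void main(String[] args) {
        String current = currentRegexp("example.dk");
        String anchored = hostAnchoredRegexp("example.dk");
        for (String line : new String[] {OWN_LINE, OTHER_HOST_LINE, REFERRER_ONLY_LINE}) {
            System.out.println("current=" + matches(current, line)
                    + " anchored=" + matches(anchored, line) + " : " + line);
        }
    }
}
```

With these sample lines, the current regexp accepts all three, while the anchored variant accepts only the line whose fetched URI is on `example.dk`.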

            People

            • Assignee:
              svc Søren Vejrup Carlsen (Inactive)
            • Reporter:
              svc Søren Vejrup Carlsen (Inactive)