  1. NetarchiveSuite
  2. NAS-2038

indexserver tries to generate deduplication indexes based on unfinished selective harvests

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Components: GUI, IndexServer

    Description

      Sometimes a specific selective harvest is scheduled so tightly that one harvest has not finished before the next one is scheduled.

      As a result, a deduplication index is requested that is partially based on an unfinished harvest X.
      The indexer then partially fails to create the index, because it cannot retrieve the metadata for harvest X.

      We then see warnings like these in the IndexApplication log:

      Feb 14, 2012 2:44:42 PM dk.netarkivet.archive.indexserver.RawMetadataCache cacheData
      INFO: No data found for job '143450' for 'crawllog' in local bitarchive 'KB'. Trying other replicas.

      Feb 14, 2012 2:44:43 PM dk.netarkivet.archive.indexserver.RawMetadataCache cacheData
      WARNING: No data found for job '143450' for 'crawllog' in any replica.

      Feb 14, 2012 2:44:43 PM dk.netarkivet.archive.indexserver.distribute.IndexRequestServer doGenerateIndex
      WARNING: Failed generating index of type 'DEDUP_CRAWL_LOG' for the jobs [143449,143450]. Missing data for jobs [143450].
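
      One possible guard, sketched here only as an illustration and not as the project's actual fix, would be to split the requested job IDs into finished and unfinished ones before a deduplication index is requested. The class and method names below (DedupJobFilter, unfinishedJobs, indexableJobs) are hypothetical and not part of NetarchiveSuite. The set of finished job IDs is assumed to come from the harvest database. The job IDs are reused from the log above, assuming that job 143450 is the unfinished one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these names exist in NetarchiveSuite.
class DedupJobFilter {

    // Returns the requested job IDs that are not known to be finished,
    // preserving the order of the request.
    static List<Long> unfinishedJobs(List<Long> requested, Set<Long> finishedJobIds) {
        List<Long> unfinished = new ArrayList<>();
        for (Long id : requested) {
            if (!finishedJobIds.contains(id)) {
                unfinished.add(id);
            }
        }
        return unfinished;
    }

    // Returns the requested job IDs that are finished and therefore
    // expected to have metadata available for indexing.
    static List<Long> indexableJobs(List<Long> requested, Set<Long> finishedJobIds) {
        List<Long> indexable = new ArrayList<>(requested);
        indexable.removeAll(unfinishedJobs(requested, finishedJobIds));
        return indexable;
    }

    public static void main(String[] args) {
        // Assumed state: job 143449 is finished, job 143450 is still running.
        List<Long> requested = List.of(143449L, 143450L);
        Set<Long> finished = Set.of(143449L);

        System.out.println("Unfinished: " + unfinishedJobs(requested, finished)); // prints "Unfinished: [143450]"
        System.out.println("Indexable: " + indexableJobs(requested, finished));   // prints "Indexable: [143449]"
    }
}
```

      With such a check in place, the index request could be limited to the indexable jobs, or postponed until the unfinished list is empty. Which of the two is appropriate is a design decision this sketch does not settle.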
      
      

          People

            svc Søren Vejrup Carlsen (Inactive)