NetarchiveSuite / NAS-1995

External sort of index saturates disk

Details

    • BNF
    • Rough

      The easiest way to verify this is to run the IndexServer on a machine with a small /tmp partition (500 MB),
      then change the settings of the IndexServerApplication to use a LocalArcrepositoryClient
      associated with a medium-sized archive (69 GB).
      Then send a dedup request using the program trunk/tests/SendDedupRequestToIndexServer.
      This request should fail due to lack of space in /tmp.

      Then change the IndexApplication settings to override the
      setting "settings.common.unixSort.useCommonTempDir" to true.

      Resend the previous request; the indexing request should now complete successfully.

    Description

      Our production engineers reported that index generation for our semiannual crawl had filled the system temp partition.

      We had configured the common.settings.tempDir property to point to a dedicated large partition, but this setting appeared to have no effect in this case.

      Here is the stack trace we obtained:

      Nov 7, 2011 4:28:24 PM dk.netarkivet.archive.indexserver.distribute.IndexRequestServer doGenerateIndex
      WARNING: Unable to generate index for jobs [823,822,825,824]
      dk.netarkivet.common.exceptions.IOFailure: Error code 2 sorting crawl log '/data/PROD_CIRCUIT_3.1.0/cache/crawllog/crawllog-823-cache'
      at dk.netarkivet.common.utils.FileUtils.sortCrawlLog(FileUtils.java:1005)
      at dk.netarkivet.archive.indexserver.CrawlLogIndexCache.getSortedCrawlLog(CrawlLogIndexCache.java:244)
      at dk.netarkivet.archive.indexserver.CrawlLogIndexCache.indexFile(CrawlLogIndexCache.java:179)
      at dk.netarkivet.archive.indexserver.CrawlLogIndexCache.combine(CrawlLogIndexCache.java:146)
      at dk.netarkivet.archive.indexserver.CombiningMultiFileBasedCache.cacheData(CombiningMultiFileBasedCache.java:80)
      at dk.netarkivet.archive.indexserver.CombiningMultiFileBasedCache.cacheData(CombiningMultiFileBasedCache.java:48)
      at dk.netarkivet.archive.indexserver.FileBasedCache.cache(FileBasedCache.java:167)
      at dk.netarkivet.archive.indexserver.distribute.IndexRequestServer.doGenerateIndex(IndexRequestServer.java:157)
      at dk.netarkivet.archive.indexserver.distribute.IndexRequestServer.access$000(IndexRequestServer.java:58)
      at dk.netarkivet.archive.indexserver.distribute.IndexRequestServer$1.run(IndexRequestServer.java:137)

      Further investigation revealed that the IndexServer process had child processes running the Unix sort command, which by default writes its temporary files to the system /tmp, causing the saturation.

      The suggested fix is to add the '-T <value of common.settings.tempDir>' parameter when building the sort command within the application code.
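
      The suggested fix could look roughly like the sketch below. It is not the actual FileUtils.sortCrawlLog code: the class name SortWithTempDir and the methods buildSortCommand and runSort are hypothetical, the sort key options used by NetarchiveSuite are left out, and it assumes a sort implementation that accepts -T (as GNU sort does).

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, for illustration only. */
public class SortWithTempDir {

    /**
     * Builds the argument list for an external sort.
     * If tempDir is non-null, "-T <tempDir>" is added so that sort writes
     * its temporary files there instead of the system default location.
     */
    static List<String> buildSortCommand(File input, File output, File tempDir) {
        List<String> cmd = new ArrayList<>();
        cmd.add("sort");
        if (tempDir != null) {
            cmd.add("-T");
            cmd.add(tempDir.getAbsolutePath());
        }
        cmd.add("-o");
        cmd.add(output.getAbsolutePath());
        cmd.add(input.getAbsolutePath());
        return cmd;
    }

    /** Runs the sort and returns its exit code (0 means success). */
    static int runSort(File input, File output, File tempDir)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(buildSortCommand(input, output, tempDir));
        pb.inheritIO(); // show sort's own error messages on the console
        return pb.start().waitFor();
    }
}
```

      Passing each argument as a separate list element avoids shell quoting problems if the configured temp directory contains spaces. A non-zero exit code would then be reported by the caller, as the IOFailure in the stack trace above does.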

            People

              svc Søren Vejrup Carlsen (Inactive)
              ngiraud Nicolas Giraud (Inactive)
              Colin Rosenthal Colin Rosenthal

                Time Tracking

                  Estimated: 2h
                  Remaining: 0h
                  Logged: 2h