dk.netarkivet.wayback.aggregator
Class IndexAggregator

java.lang.Object
  extended by dk.netarkivet.wayback.aggregator.IndexAggregator

public class IndexAggregator
extends java.lang.Object

Encapsulates the functionality for sorting and merging index files. Uses the Unix sort command for optimized sorting and file merging. Operations in this class are synchronized, so a single instance never runs more than one job at a time; no such guarantee is made across different instances.


Constructor Summary
IndexAggregator()
           
 
Method Summary
 void mergeFiles(java.io.File[] files, java.io.File outputFile)
          Takes a list of sorted files and merges them.
 void sortAndMergeFiles(java.io.File[] files, java.io.File outputFile)
          Generates a sorted CDX index file based on the set of unsorted CDX input files.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

IndexAggregator

public IndexAggregator()

Method Detail

sortAndMergeFiles

public void sortAndMergeFiles(java.io.File[] files,
                              java.io.File outputFile)
Generates a sorted CDX index file based on the set of unsorted CDX input files.

The operation will not run on a folder that already has a job in progress.

Parameters:
files - A list of the files to aggregate
outputFile - Name of the output file. If the files array is empty, no output file will be generated.
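
The behaviour described above can be sketched roughly as follows. This is an illustration only, not the NetarchiveSuite implementation: the class name SortAndMergeSketch is hypothetical, a Unix sort binary is assumed to be on the PATH, and the LC_ALL=C setting (byte-wise ordering) is an assumption rather than something this page documents.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in used for illustration; not the real IndexAggregator. */
final class SortAndMergeSketch {

    /**
     * Sorts the lines of all input files into outputFile with the Unix sort command.
     * The method is synchronized, so one instance runs at most one job at a time.
     */
    synchronized void sortAndMergeFiles(File[] files, File outputFile)
            throws IOException, InterruptedException {
        if (files.length == 0) {
            return; // Documented behaviour: empty input, no output file.
        }
        List<String> command = new ArrayList<>();
        command.add("sort");
        command.add("-o"); // POSIX option: write the result to the named file.
        command.add(outputFile.getAbsolutePath());
        for (File file : files) {
            command.add(file.getAbsolutePath());
        }
        ProcessBuilder builder = new ProcessBuilder(command);
        // Assumption: byte-wise ordering is wanted, so the locale is pinned to C.
        builder.environment().put("LC_ALL", "C");
        // Let sort write any diagnostics to this process's stderr.
        builder.inheritIO();
        int exitCode = builder.start().waitFor();
        if (exitCode != 0) {
            throw new IOException("sort exited with code " + exitCode);
        }
    }
}
```

A caller would pass the unsorted CDX files and a target file, for example new SortAndMergeSketch().sortAndMergeFiles(inputs, new File("index.cdx")), where inputs and the file name are placeholders.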

mergeFiles

public void mergeFiles(java.io.File[] files,
                       java.io.File outputFile)
Takes a list of sorted files and merges them.

Parameters:
files - The files to merge.
outputFile - The resulting file, containing the complete sorted set of index lines found in all the provided index files.
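
To make the merge contract concrete, here is a minimal sketch of merging already-sorted files. Note that it swaps in a different technique: the class documented here delegates to the Unix sort command, whereas this example does a pure Java k-way merge with a priority queue. The class name MergeSketch is hypothetical, and the inputs are assumed to be UTF-8 text sorted in String.compareTo order.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical illustration of merging sorted files; not the real IndexAggregator. */
final class MergeSketch {

    /** An open input file together with its current, not yet written line. */
    private static final class Source {
        final BufferedReader reader;
        String line;

        Source(BufferedReader reader, String line) {
            this.reader = reader;
            this.line = line;
        }
    }

    /** Merges files that are each already sorted into one sorted output file. */
    static void mergeFiles(File[] files, File outputFile) throws IOException {
        Comparator<Source> byLine = (x, y) -> x.line.compareTo(y.line);
        PriorityQueue<Source> queue = new PriorityQueue<>(byLine);
        List<BufferedReader> readers = new ArrayList<>();
        try (BufferedWriter writer =
                Files.newBufferedWriter(outputFile.toPath(), StandardCharsets.UTF_8)) {
            // Seed the queue with the first line of every non-empty input.
            for (File file : files) {
                BufferedReader reader =
                        Files.newBufferedReader(file.toPath(), StandardCharsets.UTF_8);
                readers.add(reader);
                String first = reader.readLine();
                if (first != null) {
                    queue.add(new Source(reader, first));
                }
            }
            // Repeatedly emit the smallest pending line, then refill from its file.
            while (!queue.isEmpty()) {
                Source smallest = queue.poll();
                writer.write(smallest.line);
                writer.newLine();
                smallest.line = smallest.reader.readLine();
                if (smallest.line != null) {
                    queue.add(smallest);
                }
            }
        } finally {
            for (BufferedReader reader : readers) {
                reader.close();
            }
        }
    }
}
```

Because each input is already sorted, only one line per file has to be held in memory at a time. Unlike the documented sortAndMergeFiles behaviour for empty input, this sketch creates an empty output file when given no inputs.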