Class IndexAggregator


  • public class IndexAggregator
    extends Object
    Encapsulates the functionality for sorting and merging index files. Uses the Unix sort command for optimized sorting and file merging. Operations in this class are synchronized to prevent multiple jobs from running at the same time, at least for jobs started through the same object.
    • Constructor Detail

      • IndexAggregator

        public IndexAggregator()
    • Method Detail

      • sortAndMergeFiles

        public void sortAndMergeFiles(File[] files,
                                      File outputFile)
        Generates a sorted CDX index file based on the set of unsorted CDX input files.

        The operation will not run on a folder that already has a job running.

        Parameters:
        files - A list of the files to aggregate
        outputFile - The output file. If the files array is empty, no output file will be generated.
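        The behavior described above can be sketched as follows. This is a minimal illustration, not the actual IndexAggregator source: the class SortSketch and its method sortAndMerge are hypothetical names, and the use of ProcessBuilder, the "-o" option, and the LC_ALL=C setting are assumptions about how the Unix sort command might be invoked. It requires a sort executable on the PATH.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for illustration; not the real IndexAggregator. */
class SortSketch {

    /**
     * Sorts the lines of all input files into outputFile by calling the Unix sort command.
     * The method is synchronized, so one SortSketch instance runs one job at a time.
     */
    public synchronized void sortAndMerge(File[] files, File outputFile) {
        if (files.length == 0) {
            return; // mirrors the documented behavior: no input files, no output file
        }
        List<String> command = new ArrayList<>();
        command.add("sort");
        command.add("-o"); // POSIX sort option: write the result to the named file
        command.add(outputFile.getAbsolutePath());
        for (File f : files) {
            command.add(f.getAbsolutePath());
        }
        ProcessBuilder builder = new ProcessBuilder(command);
        // Assumption: byte-wise ordering is wanted, so the locale is forced to C.
        builder.environment().put("LC_ALL", "C");
        builder.inheritIO();
        try {
            int exitCode = builder.start().waitFor();
            if (exitCode != 0) {
                throw new IllegalStateException("sort exited with code " + exitCode);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for sort", e);
        }
    }
}
```

        A caller would pass the unsorted input files and the desired output file, for example new SortSketch().sortAndMerge(inputs, new File("index.cdx")), where inputs and the file name are placeholders.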
      • mergeFiles

        public void mergeFiles(File[] files,
                               File outputFile)
        Takes a list of sorted files and merges them.
        Parameters:
        files - The files to merge.
        outputFile - The resulting file, containing the total sorted set of index lines found in all the provided index files.
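        To make the idea of merging already sorted files concrete, here is a small pure-Java sketch of a k-way merge. It is not how this class works internally (the class description says it relies on the Unix sort command); MergeSketch and mergeSortedFiles are hypothetical names, and the sketch assumes UTF-8 input and the ordering given by String.compareTo, which may differ from the ordering the real tool uses.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical illustration of merging sorted files; not the real IndexAggregator. */
class MergeSketch {

    /** One open input file together with its smallest line not yet written. */
    private static final class Source {
        final BufferedReader reader;
        String line;

        Source(BufferedReader reader, String line) {
            this.reader = reader;
            this.line = line;
        }
    }

    /** Merges individually sorted files into one sorted output file. */
    static void mergeSortedFiles(File[] files, File outputFile) {
        // The heap always yields the source whose current line is smallest.
        PriorityQueue<Source> heap = new PriorityQueue<>((x, y) -> x.line.compareTo(y.line));
        List<BufferedReader> readers = new ArrayList<>();
        try (BufferedWriter writer =
                 Files.newBufferedWriter(outputFile.toPath(), StandardCharsets.UTF_8)) {
            for (File f : files) {
                BufferedReader r = Files.newBufferedReader(f.toPath(), StandardCharsets.UTF_8);
                readers.add(r);
                String first = r.readLine();
                if (first != null) {
                    heap.add(new Source(r, first));
                }
            }
            while (!heap.isEmpty()) {
                Source s = heap.poll();
                writer.write(s.line);
                writer.newLine();
                s.line = s.reader.readLine(); // advance this source by one line
                if (s.line != null) {
                    heap.add(s);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            for (BufferedReader r : readers) {
                try {
                    r.close();
                } catch (IOException ignored) {
                    // nothing useful can be done if closing an input fails
                }
            }
        }
    }
}
```

        Because each input is already sorted, only one line per file has to be held in memory at a time, which is why merging is cheaper than sorting the combined contents from scratch.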