Class ProcessUtils


  • public class ProcessUtils
    extends java.lang.Object
    Various utilities for running processes -- not exactly Java's forte.
    • Constructor Summary

      Constructors 
      Constructor Description
      ProcessUtils()  
    • Method Summary

      Modifier and Type Method Description
      static java.lang.Object collectProcessOutput​(java.io.InputStream inputStream, int maxCollect, java.util.Set<java.lang.Thread> collectionThreads)
      Collects all output from an input stream, up to maxCollect bytes, in an output object.
      static void discardProcessOutput​(java.io.InputStream inputStream)
      Reads and discards the output from a process.
      static int runProcess​(java.lang.String... programAndArgs)
      Runs an external process that takes no input, discarding its output.
      static int runProcess​(java.lang.String[] environment, java.lang.String... programAndArgs)
      Runs an external process that takes no input, discarding its output.
      static int runUnixSort​(java.io.File inputFile, java.io.File outputFile)
      Runs a system process (Unix sort) to sort a file.
      static int runUnixSort​(java.io.File inputFile, java.io.File outputFile, java.io.File tempDir, boolean crawllogSorting)
      Runs a system process (Unix sort) to sort a file.
      static java.lang.Integer waitFor​(java.lang.Process p, long maxWait)
      Wait for the end of a process, but only for a limited time.
      static void writeProcessOutput​(java.io.InputStream inputStream, java.io.File outputFile, java.util.Set<java.lang.Thread> collectionThreads)
      Collects all output from an input stream, appending it to a file.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • runProcess

        public static int runProcess​(java.lang.String[] environment,
                                     java.lang.String... programAndArgs)
        Runs an external process that takes no input, discarding its output.
        Parameters:
        environment - An environment to run the process in (may be null)
        programAndArgs - The program and its arguments.
        Returns:
        The return code of the process.
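        The following is a minimal sketch of the general technique, not the code behind this method. It assumes the environment array holds "NAME=value" strings (the convention used by Runtime.exec), and it swaps in ProcessBuilder.Redirect.DISCARD (Java 9 and later) to drop the output instead of reading it. The class name RunProcessSketch and the method name run are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;

final class RunProcessSketch {

    /**
     * Runs a program with no input, discards stdout and stderr, and returns the exit code.
     * Assumption: environment entries look like "NAME=value"; null inherits the parent's environment.
     */
    static int run(String[] environment, String... programAndArgs) {
        ProcessBuilder pb = new ProcessBuilder(programAndArgs);
        if (environment != null) {
            Map<String, String> env = pb.environment();
            env.clear(); // replace the inherited environment entirely
            for (String entry : environment) {
                int eq = entry.indexOf('=');
                if (eq > 0) {
                    env.put(entry.substring(0, eq), entry.substring(eq + 1));
                }
            }
        }
        // Let the operating system drop the output, so no reader threads are needed.
        pb.redirectOutput(ProcessBuilder.Redirect.DISCARD);
        pb.redirectError(ProcessBuilder.Redirect.DISCARD);
        try {
            Process p = pb.start();
            p.getOutputStream().close(); // the process takes no input
            return p.waitFor();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            throw new IllegalStateException("Interrupted while waiting for process", e);
        }
    }
}
```

        Entries without an '=' are skipped in this sketch; a stricter version might reject them instead.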
      • runProcess

        public static int runProcess​(java.lang.String... programAndArgs)
        Runs an external process that takes no input, discarding its output. This is a convenience wrapper for runProcess(environment, programAndArgs).
        Parameters:
        programAndArgs - The program to run and its arguments
        Returns:
        The return code of the process.
      • discardProcessOutput

        public static void discardProcessOutput​(java.io.InputStream inputStream)
        Reads and discards the output from a process. Due to oddities in Process handling, the output has to be read char by char. This method starts a consumer thread that eats the output of the process, which prevents the process from blocking on a full output buffer.
        Parameters:
        inputStream - A stream to read up to end of file. This stream is closed at some point in the future, but not necessarily before this method returns.
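        A consumer thread of the kind described above might look like the sketch below. It is an illustration under assumptions, not this method's implementation: it reads in blocks rather than char by char, and it returns the thread (the documented method returns void) so that a caller can wait for it. The names DiscardSketch and discard are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;

final class DiscardSketch {

    /**
     * Starts a daemon thread that reads the stream to end of file, drops the data,
     * and closes the stream. Returns immediately, before the stream is drained.
     */
    static Thread discard(InputStream inputStream) {
        Thread consumer = new Thread(() -> {
            try (InputStream in = inputStream) { // closed when the thread finishes
                byte[] buffer = new byte[4096];
                while (in.read(buffer) != -1) {
                    // Nothing is kept; reading is what stops the producer from blocking.
                }
            } catch (IOException e) {
                // The stream broke, usually because the process went away; nothing to keep.
            }
        }, "discard-output");
        consumer.setDaemon(true); // do not keep the JVM alive for a stuck process
        consumer.start();
        return consumer;
    }
}
```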
      • collectProcessOutput

        public static java.lang.Object collectProcessOutput​(java.io.InputStream inputStream,
                                                            int maxCollect,
                                                            java.util.Set<java.lang.Thread> collectionThreads)
        Collects all output from an input stream, up to maxCollect bytes, in an output object. This will eventually close the given InputStream, but not necessarily before the method returns. The thread created is added to collectionThreads and should be removed from that set once all output has been collected. While only a limited amount may be written to the output object, the entire output will be read from the inputStream unless the thread or the inputStream is destroyed first.
        Parameters:
        inputStream - The input stream to read contents from
        maxCollect - The maximum number of bytes to collect, or -1 for no limit
        collectionThreads - Set of threads that concurrently collect output
        Returns:
        An object that collects the output. Once the collecting thread has finished, the object will no longer be written to. The collected output can be retrieved with the toString method.
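        The capped collection described above can be sketched as follows. This is an illustration only: the choice of ByteArrayOutputStream as the output object, the block-wise reads, and the names CollectSketch and collect are assumptions, and the sketch leaves removal of the thread from the set to the caller.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Set;

final class CollectSketch {

    /**
     * Reads the whole stream in a background thread but keeps at most maxCollect
     * bytes (-1 means no limit). The returned object's toString() gives the kept output.
     */
    static Object collect(InputStream inputStream, int maxCollect, Set<Thread> collectionThreads) {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        Thread collector = new Thread(() -> {
            try (InputStream in = inputStream) {
                byte[] buffer = new byte[4096];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    int room = maxCollect < 0 ? n : Math.min(n, maxCollect - collected.size());
                    if (room > 0) {
                        collected.write(buffer, 0, room);
                    }
                    // Bytes beyond the cap are still read, just not stored.
                }
            } catch (IOException e) {
                // Keep whatever was collected before the stream failed.
            }
        }, "collect-output");
        collector.setDaemon(true);
        collectionThreads.add(collector); // registered before start so callers can wait on it
        collector.start();
        return collected;
    }
}
```

        A caller would typically wait for every thread in the set to finish before calling toString() on the returned object, and then remove the finished threads from the set.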
      • writeProcessOutput

        public static void writeProcessOutput​(java.io.InputStream inputStream,
                                              java.io.File outputFile,
                                              java.util.Set<java.lang.Thread> collectionThreads)
        Collects all output from an input stream, appending it to a file. This will eventually close the given InputStream, but not necessarily before the method returns. The thread created is added to collectionThreads and should be removed from that set once all output has been collected.
        Parameters:
        inputStream - The input stream to read contents from
        outputFile - The file that output should be appended to.
        collectionThreads - Set of threads that concurrently collect output
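        Appending a stream to a file from a background thread can be sketched as below. This is not the method's actual code; the names AppendSketch and appendTo are hypothetical, and the sketch silently stops on an I/O error where a real implementation would presumably log it.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Set;

final class AppendSketch {

    /** Copies the stream to the end of outputFile in a background thread, then closes both. */
    static void appendTo(InputStream inputStream, File outputFile, Set<Thread> collectionThreads) {
        Thread writer = new Thread(() -> {
            // The 'true' argument opens the file in append mode, creating it if needed.
            try (InputStream in = inputStream;
                 OutputStream out = new FileOutputStream(outputFile, true)) {
                byte[] buffer = new byte[4096];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
            } catch (IOException e) {
                // Sketch only: stop copying; what was written so far stays in the file.
            }
        }, "write-output");
        writer.setDaemon(true);
        collectionThreads.add(writer); // registered before start so callers can wait on it
        writer.start();
    }
}
```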
      • waitFor

        public static java.lang.Integer waitFor​(java.lang.Process p,
                                                long maxWait)
        Waits for the end of a process, but only for a limited time. This method handles the ways in which Process.waitFor can be interrupted.
        Parameters:
        p - Process to wait for
        maxWait - The maximum number of milliseconds to wait for the process to exit.
        Returns:
        Exit value for process, or null if the process didn't exit within the expected time.
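        One way to get the behaviour described above is sketched here, under assumptions: it relies on Process.waitFor(long, TimeUnit) from Java 8, retries with the remaining time after an interrupt, and restores the interrupt flag before returning. The class name WaitForSketch is hypothetical and the code is not taken from this class.

```java
import java.util.concurrent.TimeUnit;

final class WaitForSketch {

    /** Returns the exit value, or null if the process is still running after maxWaitMillis. */
    static Integer waitFor(Process p, long maxWaitMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        boolean interrupted = false;
        try {
            while (true) {
                long remaining = Math.max(deadline - System.nanoTime(), 0);
                try {
                    // True means the process exited within the remaining time.
                    return p.waitFor(remaining, TimeUnit.NANOSECONDS)
                            ? Integer.valueOf(p.exitValue())
                            : null;
                } catch (InterruptedException e) {
                    interrupted = true; // remember it and keep waiting until the deadline
                }
            }
        } finally {
            if (interrupted) {
                Thread.currentThread().interrupt(); // hand the interrupt back to the caller
            }
        }
    }
}
```

        Returning a boxed Integer rather than an int is what makes the null "timed out" result possible, so callers need a null check before unboxing.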
      • runUnixSort

        public static int runUnixSort​(java.io.File inputFile,
                                      java.io.File outputFile)
        Runs a system process (Unix sort) to sort a file.
        Parameters:
        inputFile - the input file.
        outputFile - the output file.
        Returns:
        the process exit code.
      • runUnixSort

        public static int runUnixSort​(java.io.File inputFile,
                                      java.io.File outputFile,
                                      java.io.File tempDir,
                                      boolean crawllogSorting)
        Runs a system process (Unix sort) to sort a file.
        Parameters:
        inputFile - the input file.
        outputFile - the output file.
        tempDir - the directory in which to store temporary files (null for the default system temp directory).
        crawllogSorting - whether to sort crawl-log style (sort key "-k 4b") or not.
        Returns:
        the process exit code.
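        The command line that such a call might build can be sketched as follows. Only the "-k 4b" key comes from the description above; the use of the -T and -o options, the argument order, the LC_ALL=C setting, and the names UnixSortSketch, sortCommand and runSort are assumptions made for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class UnixSortSketch {

    /** Builds the argument list for sorting inputFile into outputFile. */
    static List<String> sortCommand(File inputFile, File outputFile, File tempDir, boolean crawllogSorting) {
        List<String> cmd = new ArrayList<>();
        cmd.add("sort");
        if (crawllogSorting) {
            cmd.add("-k");
            cmd.add("4b"); // sort key starts at field 4, ignoring leading blanks
        }
        if (tempDir != null) {
            cmd.add("-T"); // directory for sort's temporary files
            cmd.add(tempDir.getAbsolutePath());
        }
        cmd.add("-o");
        cmd.add(outputFile.getAbsolutePath());
        cmd.add(inputFile.getAbsolutePath());
        return cmd;
    }

    /** Runs the command and returns sort's exit code. */
    static int runSort(File inputFile, File outputFile, File tempDir, boolean crawllogSorting) {
        ProcessBuilder pb = new ProcessBuilder(sortCommand(inputFile, outputFile, tempDir, crawllogSorting));
        pb.environment().put("LC_ALL", "C"); // assumption: locale-independent byte ordering is wanted
        pb.inheritIO(); // let sort's diagnostics reach this process's stderr
        try {
            return pb.start().waitFor();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for sort", e);
        }
    }
}
```

        Passing each argument as its own list element avoids shell quoting problems with file names that contain spaces.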