Class HadoopFileUtils


  • public class HadoopFileUtils
    extends Object
    Utilities for file actions related to Hadoop.
    • Constructor Detail

      • HadoopFileUtils

        public HadoopFileUtils()
    • Method Detail

      • cacheFile

        public static org.apache.hadoop.fs.Path cacheFile(File file,
                                                          org.apache.hadoop.conf.Configuration conf)
                                                   throws IOException
        Given a file on a local file system, returns a cached copy of the same file on an HDFS file system.
        Parameters:
        file - the local file to cache
        conf - the Hadoop configuration
        Returns:
        an HDFS path to the cached file
        Throws:
        IOException - if caching is not enabled or the caching operation otherwise fails
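        The contract described above (copy a local file into a cache location, return the cached path, and fail with an IOException when caching is unavailable) can be illustrated with a local-filesystem analog. This is only a sketch using java.nio.file, not the Hadoop-based implementation: the class CacheFileSketch, the cacheDir parameter, and the convention that a null cacheDir means "caching not enabled" are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical local analog of cacheFile; it does not use HDFS or a Hadoop Configuration.
final class CacheFileSketch {

    // Copies 'file' into 'cacheDir' and returns the path of the copy.
    // Assumption: a null cacheDir stands in for "caching is not enabled".
    public static Path cacheFile(Path file, Path cacheDir) throws IOException {
        if (cacheDir == null) {
            throw new IOException("caching is not enabled");
        }
        // Make sure the cache directory exists before copying into it.
        Files.createDirectories(cacheDir);
        Path cached = cacheDir.resolve(file.getFileName());
        // Overwrite any earlier copy so the cache reflects the current file content.
        Files.copy(file, cached, StandardCopyOption.REPLACE_EXISTING);
        return cached;
    }
}
```

        A caller would pass the local file and the cache directory and then use the returned path in place of the original one.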
      • cleanCache

        public static void cleanCache(org.apache.hadoop.conf.Configuration configuration)
                               throws IOException
        Throws:
        IOException
      • createUniquePathInDir

        public static org.apache.hadoop.fs.Path createUniquePathInDir(org.apache.hadoop.fs.FileSystem fileSystem,
                                                                      String dir,
                                                                      UUID uuid)
        Creates and returns a unique path under a given directory.
        Parameters:
        fileSystem - The filesystem to use
        dir - A path to the parent directory to create the Path under
        uuid - The UUID used to name the Path
        Returns:
        A Hadoop path representing a unique file/directory or null if an error is encountered
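        The idea of naming a path after a UUID under a parent directory, and returning null rather than throwing on error, can be sketched with java.nio.file. This is a local analog only: the class UniquePathSketch is hypothetical, it takes no FileSystem argument, and it only builds the path without touching the filesystem, which may differ from what the documented method does.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

// Hypothetical local analog of createUniquePathInDir; it does not use Hadoop.
final class UniquePathSketch {

    // Builds <dir>/<uuid>. Returns null on any error, mirroring the documented
    // "or null if an error is encountered" behaviour.
    public static Path createUniquePathInDir(String dir, UUID uuid) {
        try {
            // A random UUID makes a name collision under 'dir' extremely unlikely.
            return Paths.get(dir).resolve(uuid.toString());
        } catch (RuntimeException e) {
            // Covers a null argument or a directory string that is not a valid path.
            return null;
        }
    }
}
```

        Because null signals failure, callers need to check the result before using it, for example before creating a file or directory at that location.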
      • initDir

        public static void initDir(org.apache.hadoop.fs.FileSystem fileSystem,
                                   String hadoopDir)
                            throws IOException
        Initializes the given directory on the filesystem by deleting any existing file at that exact path and creating all parent directories in the directory path.
        Parameters:
        fileSystem - The filesystem on which the actions are executed.
        hadoopDir - The directory path to initialize.
        Throws:
        IOException - If any action on the filesystem fails.
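        The two steps described above (remove a file that occupies the target path, then create the directory chain) can be sketched on the local filesystem with java.nio.file. This is an analog under stated assumptions, not the Hadoop implementation: the class InitDirSketch is hypothetical, and it assumes the target directory itself is created along with its parents.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical local analog of initDir; it does not use a Hadoop FileSystem.
final class InitDirSketch {

    public static void initDir(Path dir) throws IOException {
        // Step 1: a regular file sitting at the target path would block directory creation.
        if (Files.isRegularFile(dir)) {
            Files.delete(dir);
        }
        // Step 2: create the directory and any missing parents.
        // This is a no-op when the directory already exists.
        Files.createDirectories(dir);
    }
}
```

        Any failure in either step surfaces as an IOException, which matches the documented Throws clause.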
      • makeLocalInputTempFile

        public static Path makeLocalInputTempFile()
      • replaceWithCachedPathIfEnabled

        public static org.apache.hadoop.fs.Path replaceWithCachedPathIfEnabled(org.apache.hadoop.mapreduce.Mapper.Context context,
                                                                               org.apache.hadoop.fs.Path path)
                                                                        throws IOException
        Throws:
        IOException