Class HadoopFileUtils
java.lang.Object
    dk.netarkivet.common.utils.hadoop.HadoopFileUtils
public class HadoopFileUtils extends Object
Utilities for file actions related to Hadoop.
-
Constructor Summary
HadoopFileUtils()
-
Method Summary
Modifier and Type, Method, Description
static org.apache.hadoop.fs.Path cacheFile(File file, org.apache.hadoop.conf.Configuration conf)
    Given a file on a local file system, return a cached version of the same file on an HDFS file system.
static void cleanCache(org.apache.hadoop.conf.Configuration configuration)
static org.apache.hadoop.fs.Path createUniquePathInDir(org.apache.hadoop.fs.FileSystem fileSystem, String dir, UUID uuid)
    Creates and returns a unique path under a given directory.
static void initDir(org.apache.hadoop.fs.FileSystem fileSystem, String hadoopDir)
    Initializes the given directory on the filesystem by deleting any existing file on the direct path and making all parent dirs in the directory path.
static Path makeLocalInputTempFile()
static org.apache.hadoop.fs.Path replaceWithCachedPathIfEnabled(org.apache.hadoop.mapreduce.Mapper.Context context, org.apache.hadoop.fs.Path path)
-
Method Detail
-
cacheFile
public static org.apache.hadoop.fs.Path cacheFile(File file, org.apache.hadoop.conf.Configuration conf) throws IOException
Given a file on a local file system, return a cached version of the same file on an HDFS file system.
Parameters:
file - the file on the local file system
Returns:
an HDFS path to the file
Throws:
IOException - if caching is not enabled or otherwise fails
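The description above does not say how the cached copy is produced. As a rough illustration only, the following sketch shows a copy-if-absent cache on a local file system using java.nio.file instead of the Hadoop API. The class name FileCacheSketch, the cacheDir parameter, and the use of a null cache directory to mean "caching not enabled" are all assumptions made for this example; they are not part of HadoopFileUtils.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical local-file-system analogue of a file cache; not the HadoopFileUtils implementation. */
final class FileCacheSketch {

    /**
     * Copies the file into cacheDir unless a file of the same name is already there,
     * and returns the path of the cached copy.
     * Assumption: a null cacheDir stands in for "caching is not enabled".
     */
    static Path cacheFile(Path file, Path cacheDir) throws IOException {
        if (cacheDir == null) {
            throw new IOException("Caching is not enabled (no cache directory configured)");
        }
        Files.createDirectories(cacheDir);
        Path cached = cacheDir.resolve(file.getFileName());
        if (!Files.exists(cached)) {
            Files.copy(file, cached);
        }
        return cached;
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("cache-sketch", ".txt");
        Files.write(source, "example".getBytes());
        Path cacheDir = Files.createTempDirectory("cache-sketch-dir");
        System.out.println(cacheFile(source, cacheDir));
    }
}
```

A second call with the same file returns the existing copy without copying again, which is the property a cache of this kind would be expected to provide.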
-
cleanCache
public static void cleanCache(org.apache.hadoop.conf.Configuration configuration) throws IOException
Throws:
IOException
-
createUniquePathInDir
public static org.apache.hadoop.fs.Path createUniquePathInDir(org.apache.hadoop.fs.FileSystem fileSystem, String dir, UUID uuid)
Creates and returns a unique path under a given directory.
Parameters:
fileSystem - The filesystem to use
dir - A path to the parent directory to create the Path under
uuid - The UUID used to name the Path
Returns:
A Hadoop path representing a unique file/directory, or null if an error is encountered
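The idea of naming a path after a UUID under a parent directory can be sketched with java.nio.file as follows. This is a local-file-system analogue written for illustration, not the NetarchiveSuite code: the class UniquePathSketch and method uniquePathInDir are hypothetical, and whether the real method also touches the filesystem is not stated in the description above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

/** Hypothetical local analogue of building a UUID-named path under a directory. */
final class UniquePathSketch {

    /**
     * Returns dir/uuid as a path, or null if the path cannot be built
     * (for example, a null or invalid directory string), mirroring the
     * documented "null if an error is encountered" contract.
     */
    static Path uniquePathInDir(String dir, UUID uuid) {
        try {
            return Paths.get(dir).resolve(uuid.toString());
        } catch (RuntimeException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(uniquePathInDir("jobs", UUID.randomUUID()));
    }
}
```

Because a random UUID is very unlikely to collide with an existing name, callers can use the returned path for per-job input or output without coordinating names among themselves.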
-
initDir
public static void initDir(org.apache.hadoop.fs.FileSystem fileSystem, String hadoopDir) throws IOException
Initializes the given directory on the filesystem by deleting any existing file on the direct path and making all parent dirs in the directory path.
Parameters:
fileSystem - The filesystem on which the actions are executed.
hadoopDir - The directory path to initialize.
Throws:
IOException - If any action on the filesystem fails.
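The behaviour described above (remove a file that occupies the directory's path, then create the directory chain) can be sketched on a local file system with java.nio.file. The class InitDirSketch is hypothetical, and the sketch assumes that the directory itself is created along with its parents; the description does not spell this out, and the real method works against an org.apache.hadoop.fs.FileSystem instead.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical local analogue of initializing a directory; not the HadoopFileUtils implementation. */
final class InitDirSketch {

    /**
     * If a regular file sits at exactly this path, deletes it; then creates
     * the directory and any missing parent directories.
     * Calling it on an existing directory leaves that directory in place.
     */
    static void initDir(Path dir) throws IOException {
        if (Files.isRegularFile(dir)) {
            Files.delete(dir);
        }
        Files.createDirectories(dir);
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("initdir-sketch");
        Path target = base.resolve("a").resolve("b");
        initDir(target);
        System.out.println(Files.isDirectory(target));
    }
}
```

Deleting only a regular file, and never an existing directory, keeps repeated calls safe: a second call on an already initialized directory changes nothing.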
-
makeLocalInputTempFile
public static Path makeLocalInputTempFile()
-
replaceWithCachedPathIfEnabled
public static org.apache.hadoop.fs.Path replaceWithCachedPathIfEnabled(org.apache.hadoop.mapreduce.Mapper.Context context, org.apache.hadoop.fs.Path path) throws IOException
Throws:
IOException