Class GetMetadataMapper
- java.lang.Object
  - org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Text>
    - dk.netarkivet.common.utils.hadoop.GetMetadataMapper
public class GetMetadataMapper extends org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Text>
Hadoop Mapper for extracting metadata entries from metadata files.
Field Summary
- static java.lang.String MIME_PATTERN
- static java.lang.String URL_PATTERN
Constructor Summary
- GetMetadataMapper()
Method Summary
- protected void map(org.apache.hadoop.io.LongWritable lineNumber, org.apache.hadoop.io.Text filePath, org.apache.hadoop.mapreduce.Mapper.Context context)
  Mapping method.
- protected void setup(org.apache.hadoop.mapreduce.Mapper.Context context)
  Setup method that is provided by default for a Hadoop Mapper.
Field Detail
URL_PATTERN
public static final java.lang.String URL_PATTERN
- See Also: Constant Field Values
MIME_PATTERN
public static final java.lang.String MIME_PATTERN
- See Also: Constant Field Values
Constructor Detail
GetMetadataMapper
public GetMetadataMapper()
Method Detail
setup
protected void setup(org.apache.hadoop.mapreduce.Mapper.Context context) throws java.io.IOException, java.lang.InterruptedException
Setup method that is provided by default for a Hadoop Mapper. Initializes the patterns for matching the metadata records.
- Overrides: setup in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Text>
- Parameters:
  context - The job context. Used for getting the provided Configuration.
- Throws:
  java.io.IOException - Thrown by the superclass's setup method.
  java.lang.InterruptedException - Thrown by the superclass's setup method.
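The following is a minimal sketch of the idea behind setup: compile a URL pattern and a MIME pattern once, then reuse them for every record. It uses only java.util.regex and is not the class's actual implementation. The class name MetadataFilterSketch, the accepts method, the sample regular expressions, and the choice of full-string matching are all assumptions made for illustration. The real mapper presumably obtains its patterns from the job Configuration, but how it does so is not documented on this page.

```java
import java.util.regex.Pattern;

/**
 * Illustrative sketch only. Mirrors the idea of initializing a URL pattern and
 * a MIME pattern once and applying both to each metadata record.
 * All names here are hypothetical and not part of GetMetadataMapper.
 */
final class MetadataFilterSketch {
    private final Pattern urlPattern;
    private final Pattern mimePattern;

    /** Compiles both patterns once, analogous to the work done in setup(). */
    MetadataFilterSketch(String urlRegex, String mimeRegex) {
        this.urlPattern = Pattern.compile(urlRegex);
        this.mimePattern = Pattern.compile(mimeRegex);
    }

    /**
     * Returns true only if the whole URL and the whole MIME type match.
     * Full-string matching is an assumption of this sketch.
     */
    boolean accepts(String recordUrl, String recordMimeType) {
        return urlPattern.matcher(recordUrl).matches()
                && mimePattern.matcher(recordMimeType).matches();
    }

    public static void main(String[] args) {
        // Hypothetical patterns: keep plain-text crawl log entries.
        MetadataFilterSketch filter =
                new MetadataFilterSketch("metadata://.*crawl\\.log.*", "text/plain");

        String url = "metadata://example.org/crawl/logs/crawl.log";
        System.out.println(filter.accepts(url, "text/plain")); // prints true
        System.out.println(filter.accepts(url, "text/xml"));   // prints false
    }
}
```

Compiling in the constructor rather than per call reflects why pattern initialization belongs in setup: a Hadoop Mapper's setup runs once per task, while map runs once per input record.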
map
protected void map(org.apache.hadoop.io.LongWritable lineNumber, org.apache.hadoop.io.Text filePath, org.apache.hadoop.mapreduce.Mapper.Context context)
Mapping method.
- Overrides: map in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Text>
- Parameters:
  lineNumber - The current line number of the input file (ignored).
  filePath - The path to the input file.
  context - Context used for writing output.