Package dk.netarkivet.wayback.hadoop
Class CDXMapper
java.lang.Object
    org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Text>
        dk.netarkivet.wayback.hadoop.CDXMapper
public class CDXMapper extends org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Text>
Hadoop Mapper for creating CDX indexes. The input is a key (not used) and a Text line, which is assumed to be the path to an archive file. The output is a NullWritable key (not used) and the generated CDX lines as Text values.
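The contract described above can be modelled without any Hadoop dependency. The following is only a sketch under stated assumptions: the class name CdxMapContractSketch, the indexer function and the placeholder output lines are hypothetical and are not part of this package. In the real class the key is a LongWritable, the value a Text, and output goes through the Mapper.Context with a NullWritable key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

public class CdxMapContractSketch {

    /**
     * Models a single map() call: the key is ignored, the value is treated as
     * the path to an archive file, and every generated line is passed to the
     * output sink. The indexer is a hypothetical stand-in for CDX generation.
     */
    public static void map(long ignoredKey, String archiveFilePath,
                           Function<String, List<String>> indexer,
                           Consumer<String> output) {
        for (String cdxLine : indexer.apply(archiveFilePath)) {
            output.accept(cdxLine); // the real mapper would write (NullWritable, Text)
        }
    }

    public static void main(String[] args) {
        // Placeholder indexer: these are NOT real CDX lines, only markers.
        Function<String, List<String>> fakeIndexer =
                path -> List.of("placeholder-cdx-line-for " + path);

        List<String> collected = new ArrayList<>();
        map(0L, "/archive/example-1.warc.gz", fakeIndexer, collected::add);
        map(27L, "/archive/example-2.arc.gz", fakeIndexer, collected::add);
        collected.forEach(System.out::println);
    }
}
```

Passing the indexer in as a function keeps the sketch focused on the input and output shape; how the actual class produces CDX lines is not shown on this page.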
Constructor Summary
Constructor: CDXMapper()
Method Summary
Modifier and Type: protected void
Method: map(org.apache.hadoop.io.LongWritable linenumber, org.apache.hadoop.io.Text archiveFilePath, org.apache.hadoop.mapreduce.Mapper.Context context)
Description: Mapping method.
Constructor Detail
CDXMapper
public CDXMapper()
Method Detail
map
protected void map(org.apache.hadoop.io.LongWritable linenumber, org.apache.hadoop.io.Text archiveFilePath, org.apache.hadoop.mapreduce.Mapper.Context context) throws java.io.IOException, java.lang.InterruptedException
Mapping method.
Overrides:
map in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Text>
Parameters:
linenumber - The line number. It is ignored.
archiveFilePath - The path to the archive file.
context - Context used for writing output.
Throws:
java.io.IOException - If it fails to generate the CDX indexes.
java.lang.InterruptedException