java.lang.Object
- dk.netarkivet.common.utils.hadoop.MetadataExtractionStrategy

All Implemented Interfaces:

HadoopJobStrategy
```
public class MetadataExtractionStrategy
extends java.lang.Object
implements HadoopJobStrategy
```
Strategy to pass to a HadoopJob in order to extract selected content from metadata files that match specific URL and MIME patterns. The mapper expects these patterns to be set on the Configuration before use; if they are not set, it falls back to patterns that match everything. This type of job is the Hadoop counterpart to running GetMetadataArchiveBatchJob.
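The fallback to all-matching patterns can be pictured with a minimal, JDK-only sketch. It is not the mapper's actual code. The class name, the two configuration keys, and the use of full-match semantics are assumptions made for illustration, and a plain `Map` stands in for the Hadoop Configuration.

```java
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative sketch only: the class, method, and key names are hypothetical
// and are not taken from NetarchiveSuite or Hadoop.
class PatternFilterSketch {
    // Regex that matches any single-line string; used when no pattern is configured.
    static final String MATCH_ALL = ".*";

    // Hypothetical configuration keys; the real keys are not given in this document.
    static final String URL_KEY = "example.url.pattern";
    static final String MIME_KEY = "example.mime.pattern";

    // Looks up a pattern in the settings, falling back to the all-matching pattern.
    static Pattern patternFor(Map<String, String> settings, String key) {
        return Pattern.compile(settings.getOrDefault(key, MATCH_ALL));
    }

    // A record is accepted only if both its URL and its MIME type match in full
    // (full-match semantics are an assumption of this sketch).
    static boolean accepts(Map<String, String> settings, String url, String mimeType) {
        return patternFor(settings, URL_KEY).matcher(url).matches()
                && patternFor(settings, MIME_KEY).matcher(mimeType).matches();
    }

    public static void main(String[] args) {
        Map<String, String> unset = Map.of();
        Map<String, String> cdxOnly = Map.of(
                URL_KEY, "metadata://.*",
                MIME_KEY, "application/x-cdx");
        System.out.println(accepts(unset, "http://example.org/", "text/html"));                 // true
        System.out.println(accepts(cdxOnly, "metadata://example/crawl", "application/x-cdx")); // true
        System.out.println(accepts(cdxOnly, "metadata://example/crawl", "text/plain"));        // false
    }
}
```

With no patterns set, every record passes. Setting either pattern narrows the selection, which is why the patterns should be placed on the Configuration before the job is started.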

Constructor Summary

Constructors
Constructor	Description
`MetadataExtractionStrategy(long jobID, org.apache.hadoop.fs.FileSystem fileSystem)`	Constructor.

Method Summary

Modifier and Type	Method	Description
`org.apache.hadoop.fs.Path`	`createJobInputFile(java.util.UUID uuid)`	Create the job input file with a name derived from the given UUID.
`org.apache.hadoop.fs.Path`	`createJobOutputDir(java.util.UUID uuid)`	Create the job output directory with a name derived from the given UUID.
`java.lang.String`	`getJobType()`	Return a string specifying which kind of job is being run.
`int`	`runJob(org.apache.hadoop.fs.Path jobInputFile, org.apache.hadoop.fs.Path jobOutputDir)`	Runs a Hadoop job (HadoopJobTool) according to the specification of the used strategy.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - MetadataExtractionStrategy
```
public MetadataExtractionStrategy(long jobID,
                                  org.apache.hadoop.fs.FileSystem fileSystem)
```
    Constructor.
    
    Parameters:
    
    jobID - The ID for the job.
    
    fileSystem - The Hadoop FileSystem used.
- Method Detail
  - runJob
```
public int runJob(org.apache.hadoop.fs.Path jobInputFile,
                  org.apache.hadoop.fs.Path jobOutputDir)
```
    Description copied from interface: HadoopJobStrategy
    
    Runs a Hadoop job (HadoopJobTool) according to the specification of the used strategy.
    
    Specified by:
    
    runJob in interface HadoopJobStrategy
    
    Parameters:
    
    jobInputFile - The Path specifying the job's input file.
    
    jobOutputDir - The Path specifying the job's output directory.
    
    Returns:
    
    An exit code for the job.
  - createJobInputFile
```
public org.apache.hadoop.fs.Path createJobInputFile(java.util.UUID uuid)
```
    Description copied from interface: HadoopJobStrategy
    
    Create the job input file with a name derived from the given UUID.
    
    Specified by:
    
    createJobInputFile in interface HadoopJobStrategy
    
    Parameters:
    
    uuid - The UUID to create a unique name from.
    
    Returns:
    
    Path specifying where the input file is located.
  - createJobOutputDir
```
public org.apache.hadoop.fs.Path createJobOutputDir(java.util.UUID uuid)
```
    Description copied from interface: HadoopJobStrategy
    
    Create the job output directory with a name derived from the given UUID.
    
    Specified by:
    
    createJobOutputDir in interface HadoopJobStrategy
    
    Parameters:
    
    uuid - The UUID to create a unique name from.
    
    Returns:
    
    Path specifying where the output directory is located.
  - getJobType
```
public java.lang.String getJobType()
```
    Description copied from interface: HadoopJobStrategy
    
    Return a string specifying which kind of job is being run.
    
    Specified by:
    
    getJobType in interface HadoopJobStrategy
    
    Returns:
    
    String specifying the job's type.
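
Taken together, the four methods suggest one plausible calling order: derive the input file and output directory from a single UUID, run the job, then inspect the exit code. The sketch below illustrates that order with JDK classes only. The `JobStrategy` interface and `FakeStrategy` class are hypothetical stand-ins, `java.nio.file.Path` replaces `org.apache.hadoop.fs.Path`, and treating 0 as success is an assumption rather than something this page states.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

// Illustrative sketch only. Everything declared here is a hypothetical stand-in,
// not NetarchiveSuite or Hadoop code.
class StrategyWorkflowSketch {

    // Mirrors the four documented methods, with java.nio.file.Path substituted
    // for org.apache.hadoop.fs.Path so the sketch needs no Hadoop dependency.
    interface JobStrategy {
        Path createJobInputFile(UUID uuid);
        Path createJobOutputDir(UUID uuid);
        String getJobType();
        int runJob(Path jobInputFile, Path jobOutputDir);
    }

    // In-memory stand-in: builds path names from the UUID and touches no file system.
    static class FakeStrategy implements JobStrategy {
        Path lastInput;
        Path lastOutput;

        public Path createJobInputFile(UUID uuid) {
            return Paths.get("example-input", uuid.toString());
        }

        public Path createJobOutputDir(UUID uuid) {
            return Paths.get("example-output", uuid.toString());
        }

        public String getJobType() {
            return "EXAMPLE";
        }

        public int runJob(Path jobInputFile, Path jobOutputDir) {
            // Record the arguments so a caller can inspect what was passed in.
            lastInput = jobInputFile;
            lastOutput = jobOutputDir;
            return 0; // assumed convention: 0 means success
        }
    }

    // Drives a strategy: one UUID names both the input file and the output directory.
    static int execute(JobStrategy strategy) {
        UUID uuid = UUID.randomUUID();
        Path input = strategy.createJobInputFile(uuid);
        Path output = strategy.createJobOutputDir(uuid);
        int exitCode = strategy.runJob(input, output);
        if (exitCode != 0) {
            System.err.println(strategy.getJobType() + " job failed with exit code " + exitCode);
        }
        return exitCode;
    }

    public static void main(String[] args) {
        System.out.println(execute(new FakeStrategy())); // prints 0
    }
}
```

Reusing one UUID for both paths keeps a job's input and output easy to pair up afterwards. Whether callers of the real class do this is not stated here.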
