java.lang.Object
- dk.netarkivet.viewerproxy.webinterface.hadoop.CrawlLogExtractionStrategy

All Implemented Interfaces:

HadoopJobStrategy
```
public class CrawlLogExtractionStrategy
extends Object
implements HadoopJobStrategy
```
Strategy that supplies a HadoopJob for extracting crawl log lines matching a regex from metadata files. The mapper expects the regex to be set in the Configuration used for the job; if it is not set, a pattern that matches every line is used. This type of job is the Hadoop counterpart to running CrawlLogLinesMatchingRegexp.
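
The fallback behaviour described above can be illustrated with a minimal, Hadoop-free sketch. It is not the actual mapper: a plain `Map` stands in for the Hadoop Configuration, the key name `example.crawllog.regex` is hypothetical, the sample lines are simplified placeholders rather than real crawl log lines, and the use of a full-line match is an assumption.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class CrawlLogFilterSketch {
    // Hypothetical configuration key; the real key name is not given in this documentation.
    static final String REGEX_KEY = "example.crawllog.regex";
    // Fallback pattern that matches every line.
    static final String MATCH_ALL = ".*";

    // Keeps the lines that match the configured regex, or all lines if no regex is set.
    // Assumption: the whole line must match, as with Matcher.matches().
    static List<String> filter(List<String> lines, Map<String, String> conf) {
        Pattern pattern = Pattern.compile(conf.getOrDefault(REGEX_KEY, MATCH_ALL));
        return lines.stream()
                .filter(line -> pattern.matcher(line).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Simplified placeholder lines, not the real crawl log format.
        List<String> lines = List.of(
                "200 http://example.org/",
                "404 http://example.com/missing");

        // Regex set: only matching lines are kept.
        System.out.println(filter(lines, Map.of(REGEX_KEY, ".*example\\.org.*")));
        // Regex not set: the all-matching fallback keeps every line.
        System.out.println(filter(lines, Map.of()));
    }
}
```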

Constructor Summary

Constructors
Constructor Description

CrawlLogExtractionStrategy(long jobID, org.apache.hadoop.fs.FileSystem fileSystem)
Creates a strategy for the given job ID and Hadoop FileSystem.

Method Summary

Modifier and Type	Method	Description
`org.apache.hadoop.fs.Path`	`createJobInputFile(UUID uuid)`	Creates the job input file, with a name derived from a UUID.
`org.apache.hadoop.fs.Path`	`createJobOutputDir(UUID uuid)`	Creates the job output directory, with a name derived from a UUID.
`String`	`getJobType()`	Return a string specifying which kind of job is being run.
`int`	`runJob(org.apache.hadoop.fs.Path jobInputFile, org.apache.hadoop.fs.Path jobOutputDir)`	Runs a Hadoop job (HadoopJobTool) according to the specification of the used strategy.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - CrawlLogExtractionStrategy
```
public CrawlLogExtractionStrategy(long jobID,
                                  org.apache.hadoop.fs.FileSystem fileSystem)
```
    Creates a strategy for the given job ID and Hadoop FileSystem.
    
    Parameters:
    
    jobID - The ID for the job.
    
    fileSystem - The Hadoop FileSystem used.
- Method Detail
  - runJob
```
public int runJob(org.apache.hadoop.fs.Path jobInputFile,
                  org.apache.hadoop.fs.Path jobOutputDir)
```
    Description copied from interface: HadoopJobStrategy
    
    Runs a Hadoop job (HadoopJobTool) according to the specification of the used strategy.
    
    Specified by:
    
    runJob in interface HadoopJobStrategy
    
    Parameters:
    
    jobInputFile - The Path specifying the job's input file.
    
    jobOutputDir - The Path specifying the job's output directory.
    
    Returns:
    
    An exit code for the job.
  - createJobInputFile
```
public org.apache.hadoop.fs.Path createJobInputFile(UUID uuid)
```
    Description copied from interface: HadoopJobStrategy
    
    Creates the job input file, with a name derived from a UUID.
    
    Specified by:
    
    createJobInputFile in interface HadoopJobStrategy
    
    Parameters:
    
    uuid - The UUID to create a unique name from.
    
    Returns:
    
    Path specifying where the input file is located.
  - createJobOutputDir
```
public org.apache.hadoop.fs.Path createJobOutputDir(UUID uuid)
```
    Description copied from interface: HadoopJobStrategy
    
    Creates the job output directory, with a name derived from a UUID.
    
    Specified by:
    
    createJobOutputDir in interface HadoopJobStrategy
    
    Parameters:
    
    uuid - The UUID to create a unique name from.
    
    Returns:
    
    Path specifying where the output directory is located.
  - getJobType
```
public String getJobType()
```
    Description copied from interface: HadoopJobStrategy
    
    Return a string specifying which kind of job is being run.
    
    Specified by:
    
    getJobType in interface HadoopJobStrategy
    
    Returns:
    
    String specifying the job's type.
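
The method descriptions above suggest a call order: derive the input file and output directory from one UUID, then pass both to runJob and inspect the exit code. The following is a rough, Hadoop-free sketch of that order only. `StrategySketch` and `DemoStrategy` are hypothetical stand-ins, `java.nio.file.Path` replaces `org.apache.hadoop.fs.Path`, the `input`/`output` naming scheme is invented for illustration, and treating exit code 0 as success is an assumption.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

public class JobFlowSketch {
    // Hypothetical stand-in mirroring the four documented methods.
    // java.nio.file.Path is used in place of org.apache.hadoop.fs.Path.
    interface StrategySketch {
        int runJob(Path jobInputFile, Path jobOutputDir);
        Path createJobInputFile(UUID uuid);
        Path createJobOutputDir(UUID uuid);
        String getJobType();
    }

    // Demo implementation; it touches no file system and runs no job.
    static class DemoStrategy implements StrategySketch {
        public Path createJobInputFile(UUID uuid) {
            return Paths.get("input", uuid.toString()); // invented naming scheme
        }
        public Path createJobOutputDir(UUID uuid) {
            return Paths.get("output", uuid.toString()); // invented naming scheme
        }
        public String getJobType() {
            return "DEMO";
        }
        public int runJob(Path jobInputFile, Path jobOutputDir) {
            return 0; // assumption: 0 signals success
        }
    }

    // Derives both paths from the same UUID, then runs the job and returns its exit code.
    static int submit(StrategySketch strategy, UUID uuid) {
        Path jobInputFile = strategy.createJobInputFile(uuid);
        Path jobOutputDir = strategy.createJobOutputDir(uuid);
        return strategy.runJob(jobInputFile, jobOutputDir);
    }

    public static void main(String[] args) {
        StrategySketch strategy = new DemoStrategy();
        int exitCode = submit(strategy, UUID.randomUUID());
        System.out.println(strategy.getJobType() + " job exit code: " + exitCode);
    }
}
```

Using one UUID for both paths keeps the input file and output directory of a single run easy to pair up; whether the real callers do this is not stated here.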
