dk.netarkivet.harvester.scheduler.jobgen
Class AbstractJobGenerator

java.lang.Object
  extended by dk.netarkivet.harvester.scheduler.jobgen.AbstractJobGenerator
All Implemented Interfaces:
JobGenerator
Direct Known Subclasses:
DefaultJobGenerator, FixedDomainConfigurationCountJobGenerator

abstract class AbstractJobGenerator
extends java.lang.Object
implements JobGenerator

A base class for JobGenerator implementations. It is recommended to extend this class to implement a new job generator. The base algorithm iterates over the domain configurations within the harvest definition and, according to the configured subset size (HarvesterSettings.JOBGEN_DOMAIN_CONFIG_SUBSET_SIZE), builds subsets of domain configurations from which one or more jobs will be generated.
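
The subset-based iteration described above can be sketched roughly as follows. This is only an illustration, not the NetarchiveSuite source: the class SubsetLoopSketch, its generic type parameter, and the methods subsetComparator and processSubset are hypothetical stand-ins for AbstractJobGenerator, DomainConfiguration, getDomainConfigurationSubsetComparator and processDomainConfigurationSubset, and the subset size is passed to the constructor instead of being read from HarvesterSettings.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

/** Illustrative stand-in for the base loop; all names here are hypothetical. */
abstract class SubsetLoopSketch<C> {

    /** Plays the role of JOBGEN_DOMAIN_CONFIG_SUBSET_SIZE (assumed to be at least 1). */
    private final int subsetSize;

    SubsetLoopSketch(int subsetSize) {
        if (subsetSize < 1) {
            throw new IllegalArgumentException("subsetSize must be at least 1");
        }
        this.subsetSize = subsetSize;
    }

    /** Stand-in for getDomainConfigurationSubsetComparator. */
    protected abstract Comparator<C> subsetComparator();

    /** Stand-in for processDomainConfigurationSubset; returns the number of jobs created. */
    protected abstract int processSubset(Iterator<C> subset);

    /** Reads configurations in chunks of subsetSize, sorts each chunk, then processes it. */
    int generate(Iterator<C> allConfigurations) {
        int jobs = 0;
        List<C> subset = new ArrayList<>(subsetSize);
        while (allConfigurations.hasNext()) {
            subset.add(allConfigurations.next());
            boolean subsetFull = subset.size() == subsetSize;
            boolean inputExhausted = !allConfigurations.hasNext();
            if (subsetFull || inputExhausted) {
                subset.sort(subsetComparator());
                jobs += processSubset(subset.iterator());
                // Start a fresh list so a subclass may keep the old iterator safely.
                subset = new ArrayList<>(subsetSize);
            }
        }
        return jobs;
    }
}
```

In this sketch only one subset is held in memory at a time, which is one plausible reason for bounding the number of configurations scanned per iteration.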


Constructor Summary
AbstractJobGenerator()
           
 
Method Summary
 boolean canAccept(Job job, DomainConfiguration cfg)
          Tests if a configuration fits into this Job.
protected abstract  boolean checkSpecificAcceptConditions(Job job, DomainConfiguration cfg)
          Called by canAccept(Job, DomainConfiguration).
protected  void editJobOrderXml(Job job)
          Once the job has been filled with DomainConfigurations, performs the following operations: Edit the harvest template to add/remove deduplicator configuration.
 int generateJobs(HarvestDefinition harvest)
          Generates a series of jobs for the given harvest definition.
protected abstract  java.util.Comparator<DomainConfiguration> getDomainConfigurationSubsetComparator(HarvestDefinition harvest)
          Returns a comparator used to sort the subset of DOMAIN_CONFIG_SUBSET_SIZE configurations that are scanned at each iteration.
static Job getNewJob(HarvestDefinition harvest, DomainConfiguration cfg)
          Instantiates a new job.
protected abstract  int processDomainConfigurationSubset(HarvestDefinition harvest, java.util.Iterator<DomainConfiguration> domainConfSubset)
          Create new jobs from a collection of configurations.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

AbstractJobGenerator

AbstractJobGenerator()
Method Detail

generateJobs

public int generateJobs(HarvestDefinition harvest)
Description copied from interface: JobGenerator
Generates a series of jobs for the given harvest definition. Note that a job generator is expected to follow the singleton pattern, so implementations of this method should be thread-safe.
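
One common way to meet the singleton and thread-safety expectation is sketched below. It is an assumption about how an implementation might be written, not the approach taken by the library: SingletonGeneratorSketch, getInstance and the simplified generateJobs signature are hypothetical. The instance is created lazily through the initialization-on-demand holder idiom, and the method keeps all working state in local variables so that concurrent calls do not share mutable data.

```java
import java.util.List;

/** Hypothetical singleton generator; not part of NetarchiveSuite. */
final class SingletonGeneratorSketch {

    private SingletonGeneratorSketch() {
        // No instance fields: the generator is stateless.
    }

    /** The JVM initializes this nested class once, on first access, in a thread-safe way. */
    private static final class Holder {
        static final SingletonGeneratorSketch INSTANCE = new SingletonGeneratorSketch();
    }

    static SingletonGeneratorSketch getInstance() {
        return Holder.INSTANCE;
    }

    /**
     * Simplified stand-in for generateJobs: packs configuration names into
     * jobs of at most maxPerJob entries and returns the number of jobs.
     * Only local variables are used, so concurrent callers cannot interfere.
     */
    int generateJobs(List<String> configurationNames, int maxPerJob) {
        if (maxPerJob < 1) {
            throw new IllegalArgumentException("maxPerJob must be at least 1");
        }
        int jobs = 0;
        int inCurrentJob = 0;
        for (String ignored : configurationNames) {
            if (inCurrentJob == 0) {
                jobs++; // open a new job for this configuration
            }
            inCurrentJob = (inCurrentJob + 1) % maxPerJob;
        }
        return jobs;
    }
}
```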

Specified by:
generateJobs in interface JobGenerator
Parameters:
harvest - the harvest definition to process.
Returns:
the number of jobs that were generated.

getNewJob

public static Job getNewJob(HarvestDefinition harvest,
                            DomainConfiguration cfg)
Instantiates a new job.

Parameters:
harvest - the HarvestDefinition being processed
cfg - the DomainConfiguration being processed
Returns:
an instance of Job

getDomainConfigurationSubsetComparator

protected abstract java.util.Comparator<DomainConfiguration> getDomainConfigurationSubsetComparator(HarvestDefinition harvest)
Returns a comparator used to sort the subset of DOMAIN_CONFIG_SUBSET_SIZE configurations that are scanned at each iteration.

Parameters:
harvest - the HarvestDefinition being processed.
Returns:
a comparator
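
As an illustration of what a subclass might return here, the sketch below orders configurations so that those sharing an order template and byte limit end up adjacent, which makes them easy to group into the same job. The ordering criteria are an assumption chosen for the example, and ConfigStub with its fields is a hypothetical stand-in for DomainConfiguration; the harvest parameter is omitted for brevity.

```java
import java.util.Comparator;

/** Hypothetical, minimal stand-in for DomainConfiguration. */
final class ConfigStub {
    final String orderXmlName;
    final long maxBytes;
    final long expectedObjects;

    ConfigStub(String orderXmlName, long maxBytes, long expectedObjects) {
        this.orderXmlName = orderXmlName;
        this.maxBytes = maxBytes;
        this.expectedObjects = expectedObjects;
    }
}

/** Example comparator factory; the chosen ordering is illustrative only. */
final class SubsetComparatorSketch {

    private SubsetComparatorSketch() {
    }

    /** Orders by template name, then byte limit, then largest expected size first. */
    static Comparator<ConfigStub> byTemplateThenSize() {
        return Comparator.comparing((ConfigStub c) -> c.orderXmlName)
                .thenComparingLong(c -> c.maxBytes)
                .thenComparing(
                        Comparator.comparingLong((ConfigStub c) -> c.expectedObjects).reversed());
    }
}
```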

processDomainConfigurationSubset

protected abstract int processDomainConfigurationSubset(HarvestDefinition harvest,
                                                        java.util.Iterator<DomainConfiguration> domainConfSubset)
Create new jobs from a collection of configurations. All configurations must use the same order.xml file.

Parameters:
harvest - the HarvestDefinition being processed.
domainConfSubset - the configurations to use to create the jobs
Returns:
The number of jobs created
Throws:
ArgumentNotValid - if any of the parameters is null or if domainConfSubset does not contain any configurations

canAccept

public boolean canAccept(Job job,
                         DomainConfiguration cfg)
Description copied from interface: JobGenerator
Tests if a configuration fits into this Job. First tests whether the configuration has the right type of order template and byte limit for the job. The Job limits are then compared against the configuration estimates; if no limit is exceeded, true is returned, otherwise false is returned.

Specified by:
canAccept in interface JobGenerator
Parameters:
job - the job being built.
cfg - the configuration to check
Returns:
true if adding the configuration to this Job does not exceed any of the Job limits.

checkSpecificAcceptConditions

protected abstract boolean checkSpecificAcceptConditions(Job job,
                                                         DomainConfiguration cfg)
Called by canAccept(Job, DomainConfiguration). Tests the implementation-specific conditions to accept the given DomainConfiguration in the given Job. It is assumed that checkAddDomainConfInvariant(Job, DomainConfiguration) has already passed.

Parameters:
job - the Job being built
cfg - the DomainConfiguration to test
Returns:
true if the configuration passes the conditions.
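
The relationship between canAccept and checkSpecificAcceptConditions follows the template method pattern: the base class runs the common checks and delegates the rest to the subclass. The sketch below illustrates that split under stated assumptions. JobStub, DomainConfigStub, AcceptSketch and ObjectBudgetSketch are hypothetical names, the common checks are reduced to comparing the order template and byte limit, and the object-budget rule in the subclass is an invented example of an implementation-specific condition.

```java
/** Hypothetical stand-in for Job. */
final class JobStub {
    final String orderXmlName;
    final long byteLimit;
    final long maxObjects;
    final long expectedObjects;

    JobStub(String orderXmlName, long byteLimit, long maxObjects, long expectedObjects) {
        this.orderXmlName = orderXmlName;
        this.byteLimit = byteLimit;
        this.maxObjects = maxObjects;
        this.expectedObjects = expectedObjects;
    }
}

/** Hypothetical stand-in for DomainConfiguration. */
final class DomainConfigStub {
    final String orderXmlName;
    final long byteLimit;
    final long expectedObjects;

    DomainConfigStub(String orderXmlName, long byteLimit, long expectedObjects) {
        this.orderXmlName = orderXmlName;
        this.byteLimit = byteLimit;
        this.expectedObjects = expectedObjects;
    }
}

/** Base class: common checks first, then the subclass-specific hook. */
abstract class AcceptSketch {

    public boolean canAccept(JobStub job, DomainConfigStub cfg) {
        // Common checks (simplified): same order template and same byte limit.
        if (!job.orderXmlName.equals(cfg.orderXmlName)) {
            return false;
        }
        if (job.byteLimit != cfg.byteLimit) {
            return false;
        }
        // Only reached once the common checks have passed.
        return checkSpecificAcceptConditions(job, cfg);
    }

    protected abstract boolean checkSpecificAcceptConditions(JobStub job, DomainConfigStub cfg);
}

/** Example subclass: accepts a configuration while the job's object budget holds. */
final class ObjectBudgetSketch extends AcceptSketch {

    @Override
    protected boolean checkSpecificAcceptConditions(JobStub job, DomainConfigStub cfg) {
        return job.expectedObjects + cfg.expectedObjects <= job.maxObjects;
    }
}
```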

editJobOrderXml

protected void editJobOrderXml(Job job)
Once the job has been filled with DomainConfigurations, performs the following operations:
  1. Edit the harvest template to add/remove deduplicator configuration.

Parameters:
job - the job