dk.netarkivet.harvester.datamodel
Class HarvestDefinition

java.lang.Object
  extended by dk.netarkivet.harvester.datamodel.HarvestDefinition
All Implemented Interfaces:
Named
Direct Known Subclasses:
FullHarvest, PartialHarvest

public abstract class HarvestDefinition
extends java.lang.Object
implements Named

This abstract class models the general properties of a harvest definition, i.e. object ID, name, comments, and submission date.

The specializing classes FullHarvest and PartialHarvest contain the specific properties and operations of snapshot harvestdefinitions and all other kinds of harvestdefinitions, respectively.

Methods exist to generate jobs from this harvest definition.


Field Summary
protected  java.lang.String comments
           
protected  long edition
          Edition is used by the DAO to keep track of changes.
protected  java.lang.String harvestDefName
           
protected  boolean isActive
          Determines if the harvest definition is active and ready for scheduling.
protected  int numEvents
          The number of times this harvest definition has already run.
protected  java.lang.Long oid
           
protected  java.util.Date submissionDate
          The time this harvest definition was first written.
 
Constructor Summary
HarvestDefinition()
           
 
Method Summary
static FullHarvest createFullHarvest(java.lang.String harvestDefName, java.lang.String comments, java.lang.Long prevHarvestOid, long maxCountObjects, long maxBytes, long maxJobRunningTime)
          Create snapshot harvestdefinition.
 int createJobs()
          Create Jobs from the configurations in this harvestdefinition and the current value of the limits in Settings.
static PartialHarvest createPartialHarvest(java.util.List<DomainConfiguration> domainConfigurations, Schedule schedule, java.lang.String harvestDefName, java.lang.String comments)
          Create a new instance of PartialHarvest configured according to the properties of the supplied DomainConfigurations.
 boolean equals(java.lang.Object o)
          Tests whether some other object is "equal to" this HarvestDefinition.
 boolean getActive()
          Returns the activation status.
 java.lang.String getComments()
          Returns the comments for this harvest definition.
abstract  java.util.Iterator<DomainConfiguration> getDomainConfigurations()
          Returns an iterator of domain configurations for this harvest definition.
 long getEdition()
          Get the edition number.
protected abstract  long getMaxBytes()
          Returns how many bytes to harvest per domain, or -1 for no limit.
protected abstract  long getMaxCountObjects()
          Returns how many objects to harvest per domain, or 0 for no limit.
 java.lang.String getName()
          Returns the name of the harvest definition.
protected abstract  Job getNewJob(DomainConfiguration cfg)
          Get a new Job suited for this type of HarvestDefinition.
 int getNumEvents()
          Get the number of times this harvest definition has been run so far.
 java.lang.Long getOid()
          Return the object ID of this harvest definition.
 java.util.Date getSubmissionDate()
          Returns the submission date.
 int hashCode()
          Returns a hashcode of this object generated on fields oid, harvestDefName, and comments.
(package private)  boolean hasID()
          Check if this harvestdefinition has an ID set yet (doesn't happen until the DBDAO persists it).
abstract  boolean isSnapShot()
          Used to check if a harvestdefinition is a snapshot harvestdefinition.
protected  int makeJobs(java.util.Iterator<DomainConfiguration> cfglist)
          Create new jobs from a collection of configurations.
abstract  boolean runNow(java.util.Date now)
          Check if this harvest definition should be run, given the time now.
 void setActive(boolean active)
          Sets the activation status.
 void setComments(java.lang.String comments)
          Set the comments for this harvest definition.
 void setEdition(long theEdition)
          Set the edition number.
(package private)  void setNumEvents(int numEvents)
          Set the number of times this harvest definition has been run so far.
 void setOid(java.lang.Long oid)
          Set the object ID of this harvest definition.
 void setSubmissionDate(java.util.Date submissionDate)
          Set the submission date.
 java.lang.String toString()
          Return a human-readable string representation of this object.
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

oid

protected java.lang.Long oid

harvestDefName

protected java.lang.String harvestDefName

submissionDate

protected java.util.Date submissionDate
The time this harvest definition was first written.


comments

protected java.lang.String comments

edition

protected long edition
Edition is used by the DAO to keep track of changes.


isActive

protected boolean isActive
Determines if the harvest definition is active and ready for scheduling. When true, the jobs should be scheduled; otherwise the scheduler should ignore the definition. Initially a definition is assumed active, which matches the original behaviour before the isActive flag was introduced.


numEvents

protected int numEvents
The number of times this harvest definition has already run.

Constructor Detail

HarvestDefinition

public HarvestDefinition()
Method Detail

createPartialHarvest

public static PartialHarvest createPartialHarvest(java.util.List<DomainConfiguration> domainConfigurations,
                                                  Schedule schedule,
                                                  java.lang.String harvestDefName,
                                                  java.lang.String comments)
Create a new instance of PartialHarvest configured according to the properties of the supplied DomainConfigurations.

Parameters:
domainConfigurations - a list of domain configurations
schedule - the harvest definition schedule
harvestDefName - the name of the harvest definition
comments - comments
Returns:
the newly created PartialHarvest

createFullHarvest

public static FullHarvest createFullHarvest(java.lang.String harvestDefName,
                                            java.lang.String comments,
                                            java.lang.Long prevHarvestOid,
                                            long maxCountObjects,
                                            long maxBytes,
                                            long maxJobRunningTime)
Create snapshot harvestdefinition. A snapshot harvestdefinition creates jobs for all domains, using the default configuration for each domain. The HarvestDefinition is scheduled to run once as soon as possible.

When a previous harvest definition is supplied, only domains not completely harvested by the previous harvestdefinition are included in this harvestdefinition. indexready is set to false.

Parameters:
harvestDefName - the name of the harvest definition
comments - description of the harvestdefinition
prevHarvestOid - an id of a previous harvest to use as basis for this definition, ignored when null.
maxCountObjects - the maximum number of objects harvested from any domain
maxBytes - the maximum number of bytes harvested from any domain
maxJobRunningTime - The maximum running time for each job
Returns:
a snapshot harvestdefinition

setOid

public void setOid(java.lang.Long oid)
Set the object ID of this harvest definition.

Parameters:
oid - The oid
Throws:
ArgumentNotValid - if the oid is null

getOid

public java.lang.Long getOid()
Return the object ID of this harvest definition.

Returns:
The object id, or null if none.

hasID

boolean hasID()
Check if this harvestdefinition has an ID set yet (doesn't happen until the DBDAO persists it).

Returns:
true, if this harvestdefinition has an ID set
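
The interplay of setOid, getOid and hasID described above can be illustrated with a minimal sketch. OidHolderSketch is a hypothetical stand-in, not the real class, and IllegalArgumentException is used in place of ArgumentNotValid so that the example compiles on its own.

```java
/** Hypothetical stand-in that mirrors only the oid handling documented above. */
class OidHolderSketch {
    /** Stays null until a DAO assigns an ID when persisting the object. */
    private Long oid;

    /** Rejects null, mirroring the documented ArgumentNotValid behaviour. */
    void setOid(Long oid) {
        if (oid == null) {
            throw new IllegalArgumentException("oid must not be null");
        }
        this.oid = oid;
    }

    /** Returns the object ID, or null if none has been set. */
    Long getOid() {
        return oid;
    }

    /** True once an ID has been set. */
    boolean hasID() {
        return oid != null;
    }
}
```

Callers that need the ID before the object has been persisted can check hasID() first instead of testing getOid() for null.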

setSubmissionDate

public void setSubmissionDate(java.util.Date submissionDate)
Set the submission date.

Parameters:
submissionDate - the time when the harvestdefinition was created

getSubmissionDate

public java.util.Date getSubmissionDate()
Returns the submission date.

Returns:
the submission date

getName

public java.lang.String getName()
Returns the name of the harvest definition.

Specified by:
getName in interface Named
Returns:
the harvest definition name

getComments

public java.lang.String getComments()
Returns the comments for this harvest definition.

Specified by:
getComments in interface Named
Returns:
the comments for this harvest definition.

setComments

public void setComments(java.lang.String comments)
Set the comments for this harvest definition.

Parameters:
comments - A user-entered string.

getEdition

public long getEdition()
Get the edition number.

Returns:
The edition number

setEdition

public void setEdition(long theEdition)
Set the edition number.

Parameters:
theEdition - the new edition of the harvestdefinition
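
This page only states that the edition lets the DAO keep track of changes. One common way to use such a counter is optimistic concurrency control, and the sketch below assumes that approach. EditionStoreSketch and its methods are hypothetical and do not describe the actual DAO.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory store that uses an edition counter to detect stale updates. */
class EditionStoreSketch {
    /** Maps object ID to the edition currently stored. */
    private final Map<Long, Long> storedEditions = new HashMap<>();

    /** Registers a new object with edition 1. */
    void create(long oid) {
        storedEditions.put(oid, 1L);
    }

    /**
     * Accepts an update only if the caller holds the stored edition,
     * then bumps the edition and returns the new value.
     */
    long update(long oid, long callerEdition) {
        Long stored = storedEditions.get(oid);
        if (stored == null) {
            throw new IllegalStateException("unknown oid " + oid);
        }
        if (stored != callerEdition) {
            throw new IllegalStateException("stale edition " + callerEdition);
        }
        long next = stored + 1;
        storedEditions.put(oid, next);
        return next;
    }
}
```

Under this assumption, a caller would pass getEdition() to the store and call setEdition with the returned value after a successful update.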

getNumEvents

public int getNumEvents()
Get the number of times this harvest definition has been run so far.

Returns:
the number of times this harvest definition has been run so far

setNumEvents

void setNumEvents(int numEvents)
Set the number of times this harvest definition has been run so far.

Parameters:
numEvents - The number.
Throws:
ArgumentNotValid - if numEvents is negative

setActive

public void setActive(boolean active)
Sets the activation status. Only active harvestdefinitions should be scheduled.

Parameters:
active - new activation status

getActive

public boolean getActive()
Returns the activation status.

Returns:
activation status

getDomainConfigurations

public abstract java.util.Iterator<DomainConfiguration> getDomainConfigurations()
Returns an iterator of domain configurations for this harvest definition.

Returns:
Iterator containing information about the domain configurations

createJobs

public int createJobs()
Create Jobs from the configurations in this harvestdefinition and the current value of the limits in Settings. The following values are used:

dk.netarkivet.datamodel.jobs.maxRelativeSizeDifference - The maximum relative difference between the smallest and largest number of objects expected in a job.
dk.netarkivet.datamodel.jobs.minAbsolutSizeDifference - Size differences below this threshold are ignored even if the relative difference exceeds maxRelativeSizeDifference.
dk.netarkivet.datamodel.jobs.maxTotalSize - The upper limit on the total number of objects that a job may retrieve.

Returns:
The number of jobs created
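
To make the three limits more concrete, the following is a minimal sketch of how they could interact when configurations are grouped into jobs. It is not the algorithm used by createJobs. The greedy grouping over sorted sizes, the reading of the relative difference as a ratio of largest to smallest, and the names JobSplitSketch, fits and split are all assumptions made for illustration. Each configuration is represented only by its expected number of objects.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical greedy grouping of expected object counts into jobs. */
class JobSplitSketch {

    /** Decides whether a candidate size may join a job whose smallest member is given. */
    static boolean fits(long smallest, long candidate, long currentTotal,
                        long maxRelativeSizeDifference,
                        long minAbsolutSizeDifference,
                        long maxTotalSize) {
        // The job may not exceed the total object limit.
        if (currentTotal + candidate > maxTotalSize) {
            return false;
        }
        // Small absolute differences are ignored regardless of the ratio.
        if (candidate - smallest < minAbsolutSizeDifference) {
            return true;
        }
        // Otherwise the largest size may be at most the allowed multiple of the smallest.
        return candidate <= smallest * maxRelativeSizeDifference;
    }

    /** Sorts the sizes ascending and starts a new job whenever the next size does not fit. */
    static List<List<Long>> split(List<Long> expectedSizes,
                                  long maxRelativeSizeDifference,
                                  long minAbsolutSizeDifference,
                                  long maxTotalSize) {
        List<Long> sorted = new ArrayList<>(expectedSizes);
        Collections.sort(sorted);
        List<List<Long>> jobs = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long total = 0;
        for (long size : sorted) {
            if (!current.isEmpty()
                    && !fits(current.get(0), size, total, maxRelativeSizeDifference,
                             minAbsolutSizeDifference, maxTotalSize)) {
                jobs.add(current);
                current = new ArrayList<>();
                total = 0;
            }
            current.add(size);
            total += size;
        }
        if (!current.isEmpty()) {
            jobs.add(current);
        }
        return jobs;
    }
}
```

In this sketch a configuration that is larger than maxTotalSize on its own still gets a job to itself; whether the real implementation does the same is not stated on this page.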

makeJobs

protected int makeJobs(java.util.Iterator<DomainConfiguration> cfglist)
Create new jobs from a collection of configurations. All configurations must use the same order.xml file.

Parameters:
cfglist - the configurations to use to create the jobs
Returns:
The number of jobs created
Throws:
ArgumentNotValid - if any of the parameters is null or if the cfglist does not contain any configurations

getNewJob

protected abstract Job getNewJob(DomainConfiguration cfg)
Get a new Job suited for this type of HarvestDefinition.

Parameters:
cfg - The configuration to use when creating the job
Returns:
a new job

toString

public java.lang.String toString()
Return a human-readable string representation of this object.

Overrides:
toString in class java.lang.Object
Returns:
A human-readable string representation of this object

equals

public boolean equals(java.lang.Object o)
Tests whether some other object is "equal to" this HarvestDefinition. See the documentation of java.lang.Object.equals().

Overrides:
equals in class java.lang.Object
Parameters:
o - the object to compare with this HarvestDefinition
Returns:
True or false, indicating equality.

hashCode

public int hashCode()
Returns a hashcode of this object generated on fields oid, harvestDefName, and comments.

Overrides:
hashCode in class java.lang.Object
Returns:
the hashCode
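
A hash code built from oid, harvestDefName and comments can be sketched as follows. DefinitionKeySketch is a hypothetical stand-in. The page does not say which fields equals compares, so the sketch assumes the same three fields, which keeps equals and hashCode consistent with each other.

```java
import java.util.Objects;

/** Hypothetical value class using the three fields named in the hashCode description. */
final class DefinitionKeySketch {
    private final Long oid;
    private final String harvestDefName;
    private final String comments;

    DefinitionKeySketch(Long oid, String harvestDefName, String comments) {
        this.oid = oid;
        this.harvestDefName = harvestDefName;
        this.comments = comments;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof DefinitionKeySketch)) {
            return false;
        }
        DefinitionKeySketch other = (DefinitionKeySketch) o;
        // Objects.equals tolerates null, e.g. an oid that has not been assigned yet.
        return Objects.equals(oid, other.oid)
                && Objects.equals(harvestDefName, other.harvestDefName)
                && Objects.equals(comments, other.comments);
    }

    @Override
    public int hashCode() {
        // Combines the same fields that equals compares.
        return Objects.hash(oid, harvestDefName, comments);
    }
}
```

Because the oid takes part in the hash, an object's hash code changes when its ID is first assigned, so instances placed in a hash-based collection before being persisted may no longer be found afterwards.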

runNow

public abstract boolean runNow(java.util.Date now)
Check if this harvest definition should be run, given the time now.

Parameters:
now - The current time
Returns:
true if harvest definition should be run
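
A scheduler would typically combine runNow with the activation status, since only active harvestdefinitions should be scheduled. The sketch below shows one way to do that. SchedulableSketch, TimedDefinitionSketch and SchedulerSketch are hypothetical stand-ins, and the rule that a definition is due once a fixed time has been reached is an assumption, not the logic of FullHarvest or PartialHarvest.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

/** Hypothetical stand-in exposing only the methods a scheduler would need. */
interface SchedulableSketch {
    String getName();
    boolean getActive();
    boolean runNow(Date now);
}

/** Hypothetical definition that is due once a fixed point in time has been reached. */
class TimedDefinitionSketch implements SchedulableSketch {
    private final String name;
    private final boolean active;
    private final Date due;

    TimedDefinitionSketch(String name, boolean active, Date due) {
        this.name = name;
        this.active = active;
        this.due = due;
    }

    @Override public String getName() { return name; }
    @Override public boolean getActive() { return active; }
    @Override public boolean runNow(Date now) { return !now.before(due); }
}

class SchedulerSketch {
    /** Returns the names of all definitions that are both active and due at the given time. */
    static List<String> dueNames(List<? extends SchedulableSketch> defs, Date now) {
        List<String> result = new ArrayList<>();
        for (SchedulableSketch def : defs) {
            if (def.getActive() && def.runNow(now)) {
                result.add(def.getName());
            }
        }
        return result;
    }
}
```

Checking getActive() first means an inactive definition is skipped without evaluating its schedule.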

isSnapShot

public abstract boolean isSnapShot()
Used to check if a harvestdefinition is a snapshot harvestdefinition.

Returns:
true if this harvestdefinition defines a snapshot harvest

getMaxCountObjects

protected abstract long getMaxCountObjects()
Returns how many objects to harvest per domain, or 0 for no limit.

Returns:
how many objects to harvest per domain

getMaxBytes

protected abstract long getMaxBytes()
Returns how many bytes to harvest per domain, or -1 for no limit.

Returns:
how many bytes to harvest per domain
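
Note that the two limits use different "no limit" values: 0 for getMaxCountObjects and -1 for getMaxBytes. Code that interprets them has to treat each sentinel separately, as in the following minimal sketch. LimitSketch and its constants are hypothetical names introduced for illustration.

```java
/** Hypothetical helper interpreting the two limit conventions documented above. */
class LimitSketch {
    /** Documented "no limit" value for the per-domain object count. */
    static final long NO_OBJECT_LIMIT = 0L;
    /** Documented "no limit" value for the per-domain byte count. */
    static final long NO_BYTE_LIMIT = -1L;

    /** True if an object limit is set and the harvested count is above it. */
    static boolean objectsExceeded(long harvestedObjects, long maxCountObjects) {
        return maxCountObjects != NO_OBJECT_LIMIT && harvestedObjects > maxCountObjects;
    }

    /** True if a byte limit is set and the harvested size is above it. */
    static boolean bytesExceeded(long harvestedBytes, long maxBytes) {
        return maxBytes != NO_BYTE_LIMIT && harvestedBytes > maxBytes;
    }
}
```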