public class Job extends Object implements Serializable, JobInfo
A job may also be limited on bytes or objects, defined either by the configurations in the job or the harvest definition the job is generated by.
The job contains the order file, the seedlist and the current status of the job, as well as the ID of the harvest definition that defined it and names of all the configurations it is based on.
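As an illustration of how a limit from the harvest definition and a limit from a configuration might combine, here is a minimal sketch. It is not NetarchiveSuite code. It assumes that -1 means "no limit" (the convention documented for the constructor's force* parameters) and that the stricter of the two limits applies. The class `JobLimitSketch` and the method `stricterLimit` are hypothetical names introduced only for this example.

```java
public class JobLimitSketch {
    /** Sentinel for "no limit", following the -1 convention of the force* constructor parameters. */
    static final long NO_LIMIT = -1L;

    /**
     * Hypothetical helper: picks the stricter of a forced (harvest definition) limit
     * and a configuration limit, treating NO_LIMIT as infinitely large.
     */
    public static long stricterLimit(long forcedLimit, long configLimit) {
        if (forcedLimit == NO_LIMIT) {
            return configLimit;
        }
        if (configLimit == NO_LIMIT) {
            return forcedLimit;
        }
        return Math.min(forcedLimit, configLimit);
    }

    public static void main(String[] args) {
        System.out.println(stricterLimit(NO_LIMIT, 2000L)); // prints 2000
        System.out.println(stricterLimit(500L, 2000L));     // prints 500
        System.out.println(stricterLimit(NO_LIMIT, NO_LIMIT)); // prints -1
    }
}
```

Whether the real class takes the stricter limit or lets the forced value win outright is not stated on this page, so treat the rule above as an assumption.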
Modifier and Type | Field and Description |
---|---|
protected Long |
origHarvestDefinitionID
The ID of the harvest definition that generated this job.
|
protected JobStatus |
status
The status of the job.
|
Modifier | Constructor and Description |
---|---|
protected |
Job() |
|
Job(Long harvestID,
DomainConfiguration cfg,
HeritrixTemplate orderXMLdoc,
HarvestChannel channel,
long forceMaxObjectsPerDomain,
long forceMaxBytesPerDomain,
long forceMaxJobRunningTime,
int harvestNum)
Constructor for common initialisation.
|
Modifier and Type | Method and Description |
---|---|
void |
addConfiguration(DomainConfiguration cfg)
Adds a configuration to this Job.
|
void |
appendHarvestErrorDetails(String harvestErrorDetails)
Append to the list of harvest error details for this job.
|
void |
appendHarvestErrors(String harvestErrors)
Append to the list of harvest errors for this job.
|
void |
appendUploadErrorDetails(String uploadErrorDetails)
Append to the list of upload error details.
|
void |
appendUploadErrors(String uploadErrors)
Append to the list of upload errors.
|
Date |
getActualStart()
Get the actual time when this job was started.
|
Date |
getActualStop()
Get the actual time when this job was stopped/completed.
|
String |
getChannel() |
Long |
getContinuationOf() |
int |
getCountDomains()
Gets the total number of different domains harvested by this job.
|
Date |
getCreationDate()
Get the time when this job was created.
|
Map<String,String> |
getDomainConfigurationMap()
Returns a map from domain names to the names of their corresponding configurations.
|
long |
getForceMaxBytesPerDomain() |
long |
getForceMaxObjectsPerDomain() |
String |
getHarvestAudience() |
String |
getHarvestErrorDetails()
Get the list of harvest error details for this job.
|
String |
getHarvestErrors()
Get the list of harvest errors for this job.
|
String |
getHarvestFilenamePrefix()
Get the harvestFilename prefix.
|
int |
getHarvestNum()
Get the harvestNum for this job.
|
Long |
getJobID()
Get the id of this Job.
|
long |
getMaxBytesPerDomain()
Gets the maximum number of bytes harvested per domain.
|
long |
getMaxCountObjects() |
long |
getMaxJobRunningTime() |
long |
getMaxObjectsPerDomain()
Gets the maximum number of objects harvested per domain.
|
long |
getMinCountObjects() |
HeritrixTemplate |
getOrderXMLdoc()
Gets a document representation of the order.xml associated with this Job.
|
String |
getOrderXMLName()
Get the name of the order XML file used by this Job.
|
Long |
getOrigHarvestDefinitionID()
Get the id of the HarvestDefinition from which this job originates.
|
Long |
getResubmittedAsJob()
Get the ID for the job which this job was resubmitted as.
|
String |
getSeedListAsString()
Get the seedlist as a String.
|
File[] |
getSettingsXMLfiles()
Get a list of Heritrix settings.xml files.
|
List<String> |
getSortedSeedList()
Returns a list of sorted seeds for this job.
|
JobStatus |
getStatus()
Get the current status of this Job.
|
Date |
getSubmittedDate()
Get the time when this job was submitted.
|
long |
getTotalCountObjects() |
String |
getUploadErrorDetails()
Get the list of upload error details.
|
String |
getUploadErrors()
Get the list of upload errors.
|
boolean |
isConfigurationSetsByteLimit() |
boolean |
isConfigurationSetsObjectLimit() |
boolean |
isSnapshot() |
void |
setActualStart(Date actualStart)
Set the actual time when this job was started.
|
void |
setActualStop(Date actualStop)
Set the actual time when this job was stopped/completed.
|
void |
setAttributes(List<EAV.AttributeAndType> attributesAndTypes) |
void |
setChannel(String channel)
Sets the associated HarvestChannel name.
|
void |
setCreationDate(Date creationDate)
Set the Date for when this job was created.
|
void |
setHarvestAudience(String theAudience)
Set the harvest audience for this job.
|
void |
setHarvestChannel(HarvestChannel harvestChannel) |
void |
setHarvestFilenamePrefix(String prefix) |
void |
setHarvestNum(int harvestNum)
Set the harvestNum for this job.
|
void |
setJobID(Long id)
Set the id of this Job.
|
protected void |
setMaxBytesPerDomain(long maxBytesPerDomain)
Sets the maxBytesPerDomain value.
|
protected void |
setMaxJobRunningTime(long maxJobRunningTime)
Set the maxJobRunningTime value.
|
protected void |
setMaxObjectsPerDomain(long maxObjectsPerDomain)
Sets the maxObjectsPerDomain value.
|
void |
setOrderXMLDoc(HeritrixTemplate doc)
Set the orderxml for this job.
|
void |
setResubmittedAsJob(Long resubmittedAsJob)
Set the ID for the job which this job was resubmitted as.
|
void |
setSeedList(String seedList)
Set the seedlist of the job from the seedList argument.
|
void |
setSnapshot(boolean isSnapshot)
Sets whether job belongs to a snapshot or focused harvest.
|
void |
setStatus(JobStatus newStatus)
Sets status of this job.
|
void |
setSubmittedDate(Date submittedDate)
Set the Date for when this job was submitted.
|
String |
toString() |
protected Long origHarvestDefinitionID
protected Job()
public Job(Long harvestID, DomainConfiguration cfg, HeritrixTemplate orderXMLdoc, HarvestChannel channel, long forceMaxObjectsPerDomain, long forceMaxBytesPerDomain, long forceMaxJobRunningTime, int harvestNum) throws ArgumentNotValid
harvestID - the ID of the harvest definition
cfg - the configuration to base the Job on
orderXMLdoc -
channel - the channel on which the job will be submitted.
forceMaxObjectsPerDomain - the maximum number of objects harvested from a domain; overrides individual configuration settings. -1 means no limit.
forceMaxBytesPerDomain - the maximum number of bytes harvested from a domain, or -1 for no limit.
forceMaxJobRunningTime - the maximum time in seconds given to the harvester for this job
harvestNum - the run number of the harvest definition
ArgumentNotValid - if cfg or priority is null or harvestID is invalid, or if any limit < -1
public void setAttributes(List<EAV.AttributeAndType> attributesAndTypes)
public void addConfiguration(DomainConfiguration cfg)
cfg - the configuration to add
ArgumentNotValid - if cfg is null, if cfg uses a different orderxml than this job, or if this job already contains a configuration associated with the domain of configuration cfg.
public String getOrderXMLName()
public Date getActualStop()
public Date getActualStart()
public Date getSubmittedDate()
public Date getCreationDate()
Date
public File[] getSettingsXMLfiles()
public Long getOrigHarvestDefinitionID()
Specified by: getOrigHarvestDefinitionID in interface JobInfo
public void setJobID(Long id)
id - The Id for this job.
public int getCountDomains()
public void setActualStart(Date actualStart)
Sends a notification, if actualStart is set to a time after actualStop.
actualStart - A Date object representing the time when this job was started.
public void setActualStop(Date actualStop) throws ArgumentNotValid
actualStop - A Date object representing the time when this job was stopped.
ArgumentNotValid
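The ordering check described for setActualStart can be sketched as follows. This is a hypothetical stand-in, not the NetarchiveSuite implementation: the class `JobTimesSketch` is invented for this example, the notification is modelled as a message added to a list, and the sketch assumes the new start time is stored even when the notification fires. The conditions under which setActualStop throws ArgumentNotValid are not stated on this page and are not modelled.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

public class JobTimesSketch {
    private Date actualStart;
    private Date actualStop;
    /** Stand-in for a real notification mechanism; messages are only collected here. */
    private final List<String> notifications = new ArrayList<>();

    public void setActualStop(Date stop) {
        // Defensive copy, because java.util.Date is mutable.
        actualStop = (stop == null) ? null : new Date(stop.getTime());
    }

    public void setActualStart(Date start) {
        // Notify if the start time lies after an already known stop time.
        if (start != null && actualStop != null && start.after(actualStop)) {
            notifications.add("actualStart " + start.getTime()
                    + " is after actualStop " + actualStop.getTime());
        }
        // Assumption: the value is stored regardless of the notification.
        actualStart = (start == null) ? null : new Date(start.getTime());
    }

    public Date getActualStart() {
        return (actualStart == null) ? null : new Date(actualStart.getTime());
    }

    public List<String> getNotifications() {
        return notifications;
    }

    public static void main(String[] args) {
        JobTimesSketch job = new JobTimesSketch();
        job.setActualStop(new Date(1_000L));
        job.setActualStart(new Date(2_000L)); // start after stop triggers one notification
        System.out.println(job.getNotifications().size()); // prints 1
    }
}
```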
public void setOrderXMLDoc(HeritrixTemplate doc)
doc - An orderxml to be used by this job
public HeritrixTemplate getOrderXMLdoc()
public void setSeedList(String seedList)
seedList - List of seeds as one String
public String getSeedListAsString()
public JobStatus getStatus()
public void setStatus(JobStatus newStatus)
newStatus - Must be one of the values STATUS_NEW, ..., STATUS_FAILED
ArgumentNotValid - in case of invalid status argument or invalid status change
public Map<String,String> getDomainConfigurationMap()
The returned Map cannot be changed.
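A read-only map of this kind is commonly exposed through `Collections.unmodifiableMap`. The sketch below shows that pattern under stated assumptions: `DomainConfigMapSketch` is a hypothetical class, its `addConfiguration(String, String)` takes plain names instead of a DomainConfiguration, and `IllegalArgumentException` stands in for ArgumentNotValid. How the real Job class makes its map unchangeable is not stated on this page.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class DomainConfigMapSketch {
    /** Internal, mutable mapping from domain name to configuration name. */
    private final Map<String, String> domainConfigurationMap = new HashMap<>();

    /** Hypothetical simplification of addConfiguration: one configuration per domain. */
    public void addConfiguration(String domainName, String configName) {
        if (domainName == null || configName == null) {
            throw new IllegalArgumentException("domainName and configName must not be null");
        }
        if (domainConfigurationMap.containsKey(domainName)) {
            throw new IllegalArgumentException(
                    "A configuration for domain '" + domainName + "' is already present");
        }
        domainConfigurationMap.put(domainName, configName);
    }

    /** Returns a read-only view; attempts to modify it throw UnsupportedOperationException. */
    public Map<String, String> getDomainConfigurationMap() {
        return Collections.unmodifiableMap(domainConfigurationMap);
    }

    public static void main(String[] args) {
        DomainConfigMapSketch job = new DomainConfigMapSketch();
        job.addConfiguration("example.org", "defaultconfig");
        Map<String, String> view = job.getDomainConfigurationMap();
        try {
            view.put("example.com", "otherconfig");
        } catch (UnsupportedOperationException e) {
            System.out.println("The returned map is read-only");
        }
    }
}
```

Because the returned object is a view and not a copy, later calls to `addConfiguration` in this sketch are visible through a previously returned map.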
public long getMaxObjectsPerDomain()
public long getMaxBytesPerDomain()
public void setHarvestChannel(HarvestChannel harvestChannel)
public String getChannel()
The associated HarvestChannel name.
public void setChannel(String channel)
Sets the associated HarvestChannel name.
channel - the channel name
public boolean isSnapshot()
public void setSnapshot(boolean isSnapshot)
isSnapshot - true if the job belongs to a snapshot harvest, false if it belongs to a focused harvest.
public long getForceMaxObjectsPerDomain()
protected void setMaxObjectsPerDomain(long maxObjectsPerDomain)
maxObjectsPerDomain - The forceMaxObjectsPerDomain to set. 0 means no limit.
IOFailure - Thrown from auxiliary method editOrderXML_maxObjectsPerDomain.
protected void setMaxBytesPerDomain(long maxBytesPerDomain)
maxBytesPerDomain - The maxBytesPerDomain to set, or -1 for no limit.
protected void setMaxJobRunningTime(long maxJobRunningTime)
maxJobRunningTime - The maxJobRunningTime in seconds to set, or 0 for no limit.
public long getMaxJobRunningTime()
public int getHarvestNum()
public void setHarvestNum(int harvestNum)
harvestNum - a given harvestNum
public String getHarvestErrors()
public void appendHarvestErrors(String harvestErrors)
harvestErrors - a string containing harvest errors (may be null)
public String getHarvestErrorDetails()
public void appendHarvestErrorDetails(String harvestErrorDetails)
harvestErrorDetails - a string containing harvest error details.
public String getUploadErrors()
public void appendUploadErrors(String uploadErrors)
uploadErrors - a string containing upload errors.
public String getUploadErrorDetails()
public void appendUploadErrorDetails(String uploadErrorDetails)
uploadErrorDetails - a string containing upload error details.
public Long getResubmittedAsJob()
public void setSubmittedDate(Date submittedDate)
submittedDate - The date when this job was submitted
public void setCreationDate(Date creationDate)
creationDate - The date when this job was created
public void setResubmittedAsJob(Long resubmittedAsJob)
resubmittedAsJob - An Id for a new job.
public Long getContinuationOf()
public String getHarvestFilenamePrefix()
Specified by: getHarvestFilenamePrefix in interface JobInfo
public void setHarvestFilenamePrefix(String prefix)
prefix -
public long getForceMaxBytesPerDomain()
public boolean isConfigurationSetsObjectLimit()
public boolean isConfigurationSetsByteLimit()
public long getMinCountObjects()
public long getMaxCountObjects()
public long getTotalCountObjects()
public String getHarvestAudience()
public void setHarvestAudience(String theAudience)
theAudience - the harvest-audience.
public List<String> getSortedSeedList()
Copyright © 2005–2016 The Royal Danish Library, the Danish State and University Library, the National Library of France and the Austrian National Library. All rights reserved.