dk.netarkivet.harvester.harvesting.frontier
Class FullFrontierReport

java.lang.Object
  extended by dk.netarkivet.harvester.harvesting.frontier.AbstractFrontierReport
      extended by dk.netarkivet.harvester.harvesting.frontier.FullFrontierReport
All Implemented Interfaces:
FrontierReport, java.io.Serializable

public class FullFrontierReport
extends AbstractFrontierReport

Wraps a Heritrix full frontier report. Because these reports can be large, this implementation relies on the Berkeley DB Direct Persistence Layer (DPL) to store the report lines, so that lines can be kept partly in memory and partly on disk.

See Also:
Serialized Form

Nested Class Summary
(package private) static class FullFrontierReport.PersistentLine
           
(package private) static class FullFrontierReport.PersistentLineKey
           
 class FullFrontierReport.ReportIterator
           
 
Method Summary
 void addLine(FrontierReportLine line)
          Add a line to the report.
 void dispose()
          Releases all resources once this report is to be discarded.
 FrontierReportLine getLineForDomain(java.lang.String domainName)
          Returns the line of the frontier report corresponding to the queue for the given domain name.
(package private)  java.io.File getStorageDir()
          Return the directory where the BDB is stored.
 FullFrontierReport.ReportIterator iterateOnCurrentSize()
          Returns an iterator where lines are ordered by increasing currentSize.
 FullFrontierReport.ReportIterator iterateOnDomainName()
          Returns an iterator where lines are ordered by domain name natural order.
 FullFrontierReport.ReportIterator iterateOnDuplicateCurrentSize(long dupValue)
          Returns an iterator on lines having a given currentSize.
 FullFrontierReport.ReportIterator iterateOnDuplicateSpentBudget(long dupValue)
          Returns an iterator on lines having a given totalSpend.
 FullFrontierReport.ReportIterator iterateOnSpentBudget()
          Returns an iterator where lines are ordered by increasing totalSpend.
 FullFrontierReport.ReportIterator iterateOnTotalEnqueues()
          Returns an iterator where lines are ordered by primary key order: first by decreasing totalEnqueues, then by domain name natural order.
static FullFrontierReport parseContentsAsString(java.lang.String jobName, java.lang.String contentsAsString)
          Generates a Heritrix frontier report wrapper object by parsing the frontier report returned by the JMX controller as a string.
 
Methods inherited from class dk.netarkivet.harvester.harvesting.frontier.AbstractFrontierReport
getJobName, getTimestamp, setJobName, setTimestamp
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

dispose

public void dispose()
Releases all resources once this report is to be discarded. NB: this method MUST be called explicitly!
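
Since dispose() has to be called explicitly, a caller would typically pair use of the report with a try/finally block. The following is a minimal sketch of that pattern only: DisposableReport and DisposeExample are hypothetical stand-ins, not part of this package, so the example runs without NetarchiveSuite or Berkeley DB on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a report whose resources must be released explicitly. */
class DisposableReport {
    private final List<String> lines = new ArrayList<String>();
    private boolean disposed = false;

    void addLine(String line) {
        if (disposed) {
            throw new IllegalStateException("report already disposed");
        }
        lines.add(line);
    }

    int size() {
        return lines.size();
    }

    /** Stand-in for releasing the resources held by the report. */
    void dispose() {
        lines.clear();
        disposed = true;
    }

    boolean isDisposed() {
        return disposed;
    }
}

class DisposeExample {
    /** Fills the report and returns its size; dispose() runs even if an exception is thrown. */
    static int countLines(DisposableReport report, String... lines) {
        try {
            for (String line : lines) {
                report.addLine(line);
            }
            return report.size();
        } finally {
            report.dispose();
        }
    }

    public static void main(String[] args) {
        DisposableReport report = new DisposableReport();
        int n = countLines(report, "a.dk", "b.dk");
        System.out.println(n + " lines, disposed=" + report.isDisposed());
    }
}
```

The same shape should apply to a real FullFrontierReport: obtain the report, use it inside try, and call dispose() in finally.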


addLine

public void addLine(FrontierReportLine line)
Description copied from interface: FrontierReport
Add a line to the report.

Specified by:
addLine in interface FrontierReport
Specified by:
addLine in class AbstractFrontierReport
Parameters:
line - line to add.

getLineForDomain

public FrontierReportLine getLineForDomain(java.lang.String domainName)
Description copied from interface: FrontierReport
Returns the line of the frontier report corresponding to the queue for the given domain name.

Specified by:
getLineForDomain in interface FrontierReport
Specified by:
getLineForDomain in class AbstractFrontierReport
Parameters:
domainName - the domain name.
Returns:
null if no queue for this domain name exists, otherwise the line of the frontier report corresponding to the queue for the given domain name.

iterateOnTotalEnqueues

public FullFrontierReport.ReportIterator iterateOnTotalEnqueues()
Returns an iterator where lines are ordered by primary key order: first by decreasing totalEnqueues, then by domain name natural order.

Returns:
an iterator on the report lines.
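
To make the primary key order concrete, the sketch below expresses "decreasing totalEnqueues, then domain name natural order" as a plain Java Comparator. SketchLine and PrimaryKeyOrder are hypothetical names introduced for illustration; the real class obtains this order from its Berkeley DB primary key, not from an in-memory sort.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical, simplified stand-in for a frontier report line. */
class SketchLine {
    final String domainName;
    final long totalEnqueues;

    SketchLine(String domainName, long totalEnqueues) {
        this.domainName = domainName;
        this.totalEnqueues = totalEnqueues;
    }
}

class PrimaryKeyOrder {
    /** Decreasing totalEnqueues first, then domain name in natural (String) order. */
    static final Comparator<SketchLine> COMPARATOR = new Comparator<SketchLine>() {
        @Override
        public int compare(SketchLine a, SketchLine b) {
            // Arguments are swapped so that larger totalEnqueues sort first.
            int byEnqueues = Long.compare(b.totalEnqueues, a.totalEnqueues);
            if (byEnqueues != 0) {
                return byEnqueues;
            }
            return a.domainName.compareTo(b.domainName);
        }
    };

    /** Returns the domain names of the given lines in primary key order. */
    static List<String> sortedDomains(List<SketchLine> lines) {
        List<SketchLine> copy = new ArrayList<SketchLine>(lines);
        Collections.sort(copy, COMPARATOR);
        List<String> names = new ArrayList<String>();
        for (SketchLine line : copy) {
            names.add(line.domainName);
        }
        return names;
    }

    public static void main(String[] args) {
        List<SketchLine> lines = Arrays.asList(
                new SketchLine("b.dk", 10),
                new SketchLine("a.dk", 10),
                new SketchLine("c.dk", 50));
        System.out.println(sortedDomains(lines)); // prints [c.dk, a.dk, b.dk]
    }
}
```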

iterateOnDomainName

public FullFrontierReport.ReportIterator iterateOnDomainName()
Returns an iterator where lines are ordered by domain name natural order.

Returns:
an iterator on the report lines.

iterateOnCurrentSize

public FullFrontierReport.ReportIterator iterateOnCurrentSize()
Returns an iterator where lines are ordered by increasing currentSize.

Returns:
an iterator on the report lines.

iterateOnDuplicateCurrentSize

public FullFrontierReport.ReportIterator iterateOnDuplicateCurrentSize(long dupValue)
Returns an iterator on lines having a given currentSize.

Parameters:
dupValue - the currentSize value that the returned lines must have.
Returns:
an iterator on the report lines.
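
The two currentSize iterators behave like lookups on a secondary index that allows duplicate keys: one walks all lines in increasing key order, the other returns only the lines sharing one key value. The sketch below imitates that behaviour with an in-memory TreeMap instead of Berkeley DB. CurrentSizeIndex and its methods are hypothetical and are not claimed to mirror the actual implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical in-memory analogue of a secondary index on currentSize. */
class CurrentSizeIndex {
    // Sorted by currentSize; each key maps to every domain having that size.
    private final TreeMap<Long, List<String>> bySize = new TreeMap<Long, List<String>>();

    void add(String domainName, long currentSize) {
        List<String> bucket = bySize.get(currentSize);
        if (bucket == null) {
            bucket = new ArrayList<String>();
            bySize.put(currentSize, bucket);
        }
        bucket.add(domainName);
    }

    /** Analogue of iterating in increasing currentSize order. */
    List<String> inIncreasingSizeOrder() {
        List<String> result = new ArrayList<String>();
        for (List<String> bucket : bySize.values()) {
            result.addAll(bucket);
        }
        return result;
    }

    /** Analogue of iterating over the lines that share one currentSize value. */
    List<String> withSize(long dupValue) {
        List<String> bucket = bySize.get(dupValue);
        if (bucket == null) {
            return Collections.<String>emptyList();
        }
        return new ArrayList<String>(bucket);
    }

    public static void main(String[] args) {
        CurrentSizeIndex index = new CurrentSizeIndex();
        index.add("a.dk", 5);
        index.add("b.dk", 2);
        index.add("c.dk", 5);
        System.out.println(index.inIncreasingSizeOrder()); // prints [b.dk, a.dk, c.dk]
        System.out.println(index.withSize(5));             // prints [a.dk, c.dk]
    }
}
```

The totalSpend iterators (iterateOnSpentBudget and iterateOnDuplicateSpentBudget) can be pictured the same way, with totalSpend as the key.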

iterateOnSpentBudget

public FullFrontierReport.ReportIterator iterateOnSpentBudget()
Returns an iterator where lines are ordered by increasing totalSpend.

Returns:
an iterator on the report lines.

iterateOnDuplicateSpentBudget

public FullFrontierReport.ReportIterator iterateOnDuplicateSpentBudget(long dupValue)
Returns an iterator on lines having a given totalSpend.

Parameters:
dupValue - the totalSpend value that the returned lines must have.
Returns:
an iterator on the report lines.

parseContentsAsString

public static FullFrontierReport parseContentsAsString(java.lang.String jobName,
                                                       java.lang.String contentsAsString)
Generates a Heritrix frontier report wrapper object by parsing the frontier report returned by the JMX controller as a string.

Parameters:
jobName - the Heritrix job name
contentsAsString - the text returned by the JMX call
Returns:
the report wrapper object

getStorageDir

java.io.File getStorageDir()
Return the directory where the BDB is stored.

Returns:
the storage directory.