dk.netarkivet.harvester.harvesting.frontier
Class FullFrontierReport

java.lang.Object
  extended by dk.netarkivet.harvester.harvesting.frontier.AbstractFrontierReport
      extended by dk.netarkivet.harvester.harvesting.frontier.FullFrontierReport
All Implemented Interfaces:
FrontierReport, java.io.Serializable

public class FullFrontierReport
extends AbstractFrontierReport

Wraps a Heritrix full frontier report. Because these reports can be large, this implementation relies on the Berkeley DB Direct Persistence Layer to store the report lines, so that the lines can be held partly in memory and partly on disk.

See Also:
Serialized Form

Nested Class Summary
(package private) static class FullFrontierReport.PersistentLine
           
(package private) static class FullFrontierReport.PersistentLineKey
           
 
Method Summary
 void addLine(FrontierReportLine line)
          Add a line to the report.
 void dispose()
          Releases all resources once this report is to be discarded.
 FrontierReportLine[] getBiggestTotalEnqueues(int howMany)
          Returns the N lines with the biggest totalEnqueues values, corresponding to active queues (i.e. not exhausted or retired).
 FrontierReportLine[] getExhaustedQueues(int maxSize)
          Returns the exhausted queues, i.e. the queues whose current size is zero.
 FrontierReportLine getLineForDomain(java.lang.String domainName)
          Returns the line of the frontier report corresponding to the queue for the given domain name.
 FrontierReportLine[] getRetiredQueues(int maxSize)
          Returns the retired queues, i.e. the queues that have hit the totalBudget value (queue-total-budget).
(package private)  java.io.File getStorageDir()
          Return the directory where the BDB is stored.
static FullFrontierReport parseContentsAsString(java.lang.String jobName, java.lang.String contentsAsString)
          Generates a Heritrix frontier report wrapper object by parsing the frontier report returned by the JMX controller as a string.
 
Methods inherited from class dk.netarkivet.harvester.harvesting.frontier.AbstractFrontierReport
getJobName, getTimestamp, setJobName, setTimestamp
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

dispose

public void dispose()
Releases all resources once this report is to be discarded. NB: this method MUST be called explicitly!
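
Because dispose() must be called explicitly, callers may want to wrap every use of a report in try/finally. The following is a minimal sketch of that pattern, not code from this package: DisposableReport is a hypothetical stand-in interface exposing only dispose(), and ReportScope.withReport is a hypothetical helper name.

```java
import java.util.function.Function;

/** Hypothetical stand-in for the disposable part of a report (assumption, not the real class). */
interface DisposableReport {
    void dispose();
}

/** Hypothetical helper that guarantees dispose() runs after the report has been used. */
final class ReportScope {
    private ReportScope() {
    }

    /**
     * Applies work to the report and always calls dispose() afterwards,
     * whether work returns normally or throws.
     */
    static <R extends DisposableReport, T> T withReport(R report, Function<R, T> work) {
        try {
            return work.apply(report);
        } finally {
            report.dispose();
        }
    }
}
```

With the real class, the same shape would be a try block around the calls that read the report, followed by a finally block that calls dispose().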


addLine

public void addLine(FrontierReportLine line)
Description copied from interface: FrontierReport
Add a line to the report.

Specified by:
addLine in interface FrontierReport
Specified by:
addLine in class AbstractFrontierReport
Parameters:
line - line to add.

getLineForDomain

public FrontierReportLine getLineForDomain(java.lang.String domainName)
Description copied from interface: FrontierReport
Returns the line of the frontier report corresponding to the queue for the given domain name.

Specified by:
getLineForDomain in interface FrontierReport
Specified by:
getLineForDomain in class AbstractFrontierReport
Parameters:
domainName - the domain name.
Returns:
null if no queue for this domain name exists, otherwise the line of the frontier report corresponding to the queue for the given domain name.
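
Since this method returns null when no queue exists for the domain, callers need a null check before using the result. A minimal sketch of one way to make that explicit follows; LineSource is a hypothetical stand-in for the single lookup method documented above, and DomainLookup.find is a hypothetical helper name.

```java
import java.util.Optional;

/** Hypothetical helper that turns the documented null return into an Optional. */
final class DomainLookup {
    private DomainLookup() {
    }

    /** Hypothetical stand-in for the lookup method of a frontier report. */
    interface LineSource<L> {
        /** Returns the line for the domain, or null if there is no such queue. */
        L getLineForDomain(String domainName);
    }

    /** Wraps the lookup so that a missing queue becomes Optional.empty(). */
    static <L> Optional<L> find(LineSource<L> report, String domainName) {
        return Optional.ofNullable(report.getLineForDomain(domainName));
    }
}
```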

getBiggestTotalEnqueues

public FrontierReportLine[] getBiggestTotalEnqueues(int howMany)
Returns the N lines with the biggest totalEnqueues values, corresponding to active queues (i.e. not exhausted or retired).

Parameters:
howMany - how many lines to fetch (N)
Returns:
the N lines with the biggest totalEnqueues values.
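
The selection described above can be illustrated in memory with a bounded min-heap. This is only a sketch of the idea and not how this class is implemented, since the class stores its lines in Berkeley DB. Line is a hypothetical stand-in for FrontierReportLine, and its retired flag is an assumption made to keep the example short.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** In-memory illustration of "N biggest totalEnqueues among active queues". */
final class TopEnqueues {
    private TopEnqueues() {
    }

    /** Hypothetical stand-in for a report line; only the fields used here. */
    static final class Line {
        final String domainName;
        final long currentSize;
        final long totalEnqueues;
        final boolean retired; // assumption: retirement is already known

        Line(String domainName, long currentSize, long totalEnqueues, boolean retired) {
            this.domainName = domainName;
            this.currentSize = currentSize;
            this.totalEnqueues = totalEnqueues;
            this.retired = retired;
        }
    }

    /** Returns at most howMany active lines, largest totalEnqueues first. */
    static List<Line> biggestTotalEnqueues(Iterable<Line> lines, int howMany) {
        if (howMany <= 0) {
            return new ArrayList<>();
        }
        Comparator<Line> byEnqueues = Comparator.comparingLong((Line l) -> l.totalEnqueues);
        // Min-heap: the smallest of the current candidates sits at the head.
        PriorityQueue<Line> heap = new PriorityQueue<>(byEnqueues);
        for (Line l : lines) {
            if (l.retired || l.currentSize == 0) {
                continue; // skip retired and exhausted queues
            }
            heap.add(l);
            if (heap.size() > howMany) {
                heap.poll(); // drop the smallest candidate
            }
        }
        List<Line> result = new ArrayList<>(heap);
        result.sort(byEnqueues.reversed());
        return result;
    }
}
```

The heap never holds more than howMany + 1 lines, so memory use stays small even when the report itself is large.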

parseContentsAsString

public static FullFrontierReport parseContentsAsString(java.lang.String jobName,
                                                       java.lang.String contentsAsString)
Generates a Heritrix frontier report wrapper object by parsing the frontier report returned by the JMX controller as a string.

Parameters:
jobName - the Heritrix job name
contentsAsString - the text returned by the JMX call
Returns:
the report wrapper object

getStorageDir

java.io.File getStorageDir()
Return the directory where the BDB is stored.

Returns:
the storage directory.

getRetiredQueues

public FrontierReportLine[] getRetiredQueues(int maxSize)
Returns the retired queues, i.e. the queues that have hit the totalBudget value (queue-total-budget).

Specified by:
getRetiredQueues in interface FrontierReport
Specified by:
getRetiredQueues in class AbstractFrontierReport
Parameters:
maxSize - maximum count of elements to fetch
Returns:
an array of retired queues.

getExhaustedQueues

public FrontierReportLine[] getExhaustedQueues(int maxSize)
Returns the exhausted queues, i.e. the queues whose current size is zero.

Specified by:
getExhaustedQueues in interface FrontierReport
Specified by:
getExhaustedQueues in class AbstractFrontierReport
Parameters:
maxSize - maximum count of elements to fetch
Returns:
an array of exhausted queues.
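
The two queue states described for getRetiredQueues and getExhaustedQueues, together with the maxSize cap, can be sketched in memory as follows. This is an illustration under stated assumptions rather than the actual implementation: Line is a hypothetical stand-in with assumed field names (totalSpend, totalBudget), a queue is treated as retired when totalBudget is positive and totalSpend has reached it, and a queue of size zero is treated as exhausted even if it has also reached its budget.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory illustration of the exhausted and retired queue states. */
final class QueueStates {
    private QueueStates() {
    }

    enum State { ACTIVE, EXHAUSTED, RETIRED }

    /** Hypothetical stand-in for a report line; field names are assumptions. */
    static final class Line {
        final String domainName;
        final long currentSize;
        final long totalSpend;
        final long totalBudget;

        Line(String domainName, long currentSize, long totalSpend, long totalBudget) {
            this.domainName = domainName;
            this.currentSize = currentSize;
            this.totalSpend = totalSpend;
            this.totalBudget = totalBudget;
        }
    }

    /** Classifies a line; checking "exhausted" first is an assumption. */
    static State classify(Line l) {
        if (l.currentSize == 0) {
            return State.EXHAUSTED;
        }
        // Assumption: a non-positive budget means "no budget set".
        if (l.totalBudget > 0 && l.totalSpend >= l.totalBudget) {
            return State.RETIRED;
        }
        return State.ACTIVE;
    }

    /** Returns at most maxSize lines in the wanted state, in iteration order. */
    static List<Line> select(Iterable<Line> lines, State wanted, int maxSize) {
        List<Line> out = new ArrayList<>();
        for (Line l : lines) {
            if (out.size() >= maxSize) {
                break;
            }
            if (classify(l) == wanted) {
                out.add(l);
            }
        }
        return out;
    }
}
```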