dk.netarkivet.harvester.harvesting
Class ContentSizeAnnotationPostProcessor

java.lang.Object
  extended by javax.management.Attribute
      extended by org.archive.crawler.settings.Type
          extended by org.archive.crawler.settings.ComplexType
              extended by org.archive.crawler.settings.ModuleType
                  extended by org.archive.crawler.framework.Processor
                      extended by dk.netarkivet.harvester.harvesting.ContentSizeAnnotationPostProcessor
All Implemented Interfaces:
java.io.Serializable, javax.management.DynamicMBean

public class ContentSizeAnnotationPostProcessor
extends org.archive.crawler.framework.Processor

A post-processor that adds a content-size: annotation to each successfully harvested URI.

See Also:
Serialized Form

Nested Class Summary
 
Nested classes/interfaces inherited from class org.archive.crawler.settings.ComplexType
org.archive.crawler.settings.ComplexType.MBeanAttributeInfoIterator
 
Field Summary
static java.lang.String CONTENT_SIZE_ANNOTATION_PREFIX
          Prefix associated with annotations made by this processor.
 
Fields inherited from class org.archive.crawler.framework.Processor
ATTR_DECIDE_RULES, ATTR_ENABLED, attrDecideRules
 
Fields inherited from class org.archive.crawler.settings.ComplexType
definition, definitionMap
 
Constructor Summary
ContentSizeAnnotationPostProcessor(java.lang.String name)
          Creates a processor with the given name.
 
Method Summary
protected  void innerProcess(org.archive.crawler.datamodel.CrawlURI crawlURI)
          For each URI with a successful status code (status code > 0), adds an annotation with the content size.
 
Methods inherited from class org.archive.crawler.framework.Processor
checkForInterrupt, finalTasks, getController, getDecideRule, getDefaultNextProcessor, initialTasks, innerRejectProcess, isContentToProcess, isEnabled, isExpectedMimeType, isHttpTransactionContentToProcess, kickUpdate, process, report, rulesAccept, rulesAccept, setDefaultNextProcessor, spawn
 
Methods inherited from class org.archive.crawler.settings.ModuleType
addElement, listUsedFiles
 
Methods inherited from class org.archive.crawler.settings.ComplexType
addElementToDefinition, checkValue, earlyInitialize, getAbsoluteName, getAttribute, getAttribute, getAttribute, getAttributeInfo, getAttributeInfo, getAttributeInfoIterator, getAttributes, getDataContainerRecursive, getDataContainerRecursive, getDefaultValue, getDescription, getElementFromDefinition, getLegalValues, getLocalAttribute, getMBeanInfo, getMBeanInfo, getParent, getPreservedFields, getSettingsHandler, getUncheckedAttribute, getValue, globalSettings, invoke, isInitialized, isOverridden, iterator, removeElementFromDefinition, setAsOrder, setAttribute, setAttribute, setAttributes, setDescription, setPreservedFields, toString, unsetAttribute
 
Methods inherited from class org.archive.crawler.settings.Type
addConstraint, equals, getConstraints, getLegalValueType, isExpertSetting, isOverrideable, isTransient, setExpertSetting, setLegalValueType, setOverrideable, setTransient
 
Methods inherited from class javax.management.Attribute
getName, hashCode
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

CONTENT_SIZE_ANNOTATION_PREFIX

public static final java.lang.String CONTENT_SIZE_ANNOTATION_PREFIX
Prefix associated with annotations made by this processor.

See Also:
Constant Field Values
Constructor Detail

ContentSizeAnnotationPostProcessor

public ContentSizeAnnotationPostProcessor(java.lang.String name)
Creates a processor with the given name.

Parameters:
name - the name of the processor.
See Also:
Processor
Method Detail

innerProcess

protected void innerProcess(org.archive.crawler.datamodel.CrawlURI crawlURI)
                     throws java.lang.InterruptedException
For each URI with a successful status code (status code > 0), adds an annotation with the content size.

Overrides:
innerProcess in class org.archive.crawler.framework.Processor
Parameters:
crawlURI - the URI to annotate if it was fetched successfully.
Throws:
ArgumentNotValid - if crawlURI is null.
java.lang.InterruptedException - never thrown by this implementation.
See Also:
Processor.innerProcess(org.archive.crawler.datamodel.CrawlURI)
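Example:
The behaviour documented for innerProcess can be illustrated with a minimal, self-contained sketch. It does not use the Heritrix classes. FakeCrawlUri is a hypothetical stand-in for org.archive.crawler.datamodel.CrawlURI, the prefix value "content-size:" is an assumption inferred from the class description, and IllegalArgumentException stands in for ArgumentNotValid. The real processor may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

public class ContentSizeAnnotationSketch {

    /** Assumed prefix value, inferred from the class description. */
    static final String CONTENT_SIZE_ANNOTATION_PREFIX = "content-size:";

    /** Hypothetical stand-in for CrawlURI: fetch status, size, annotations. */
    static final class FakeCrawlUri {
        final int fetchStatus;
        final long contentSize;
        final List<String> annotations = new ArrayList<>();

        FakeCrawlUri(int fetchStatus, long contentSize) {
            this.fetchStatus = fetchStatus;
            this.contentSize = contentSize;
        }
    }

    /** Adds a content-size annotation only when the status code is > 0. */
    static void annotate(FakeCrawlUri uri) {
        if (uri == null) {
            // The documented method throws ArgumentNotValid here instead.
            throw new IllegalArgumentException("crawlURI must not be null");
        }
        if (uri.fetchStatus > 0) {
            uri.annotations.add(CONTENT_SIZE_ANNOTATION_PREFIX + uri.contentSize);
        }
    }

    public static void main(String[] args) {
        FakeCrawlUri fetched = new FakeCrawlUri(200, 1234L);
        FakeCrawlUri failed = new FakeCrawlUri(-2, 0L);
        annotate(fetched);
        annotate(failed);
        System.out.println(fetched.annotations); // prints [content-size:1234]
        System.out.println(failed.annotations);  // prints []
    }
}
```

URIs with a status code of zero or below are left untouched, so a consumer of the annotations only sees sizes for fetches that succeeded.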