Class ContentSizeAnnotationPostProcessor

  • All Implemented Interfaces:
    org.archive.checkpointing.Checkpointable, org.archive.spring.HasKeyedProperties, org.springframework.beans.factory.Aware, org.springframework.beans.factory.BeanNameAware, org.springframework.context.Lifecycle

    public class ContentSizeAnnotationPostProcessor
    extends org.archive.modules.Processor
    A post-processor that adds a content-size: annotation for each successfully harvested URI. Add the bean for this processor to the list of dispositionProcessors.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static String CONTENT_SIZE_ANNOTATION_PREFIX
      Prefix associated with annotations made by this processor.
      • Fields inherited from class org.archive.modules.Processor

        beanName, isRunning, kp, recoveryCheckpoint, uriCount
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      protected void innerProcess(org.archive.modules.CrawlURI crawlURI)
      For each URI with a successful status code (status code > 0), adds an annotation with the content size.
      protected boolean shouldProcess(org.archive.modules.CrawlURI arg0)
      • Methods inherited from class org.archive.modules.Processor

        doCheckpoint, finishCheckpoint, flattenVia, fromCheckpointJson, getBeanName, getEnabled, getKeyedProperties, getRecordedSize, getShouldProcessRule, getURICount, hasHttpAuthenticationCredential, innerProcessResult, innerRejectProcess, isRunning, isSuccess, process, report, setBeanName, setEnabled, setRecoveryCheckpoint, setShouldProcessRule, start, startCheckpoint, stop, toCheckpointJson
    • Field Detail

      • CONTENT_SIZE_ANNOTATION_PREFIX

        public static final String CONTENT_SIZE_ANNOTATION_PREFIX
        Prefix associated with annotations made by this processor.
        See Also:
        Constant Field Values
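A consumer of the crawl output can recover the recorded size by matching on this prefix. The following is a minimal sketch, not part of this class: it assumes the constant's value is the string "content-size:" (suggested by the class description, but not shown on this page) and that the number of bytes follows the prefix directly. The class name ContentSizeAnnotationReader and the method contentSize are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;

// Hypothetical helper: reads back the size from a list of annotation strings.
class ContentSizeAnnotationReader {
    // ASSUMPTION: mirrors CONTENT_SIZE_ANNOTATION_PREFIX; check "Constant Field Values" for the real value.
    static final String ASSUMED_PREFIX = "content-size:";

    /** Returns the size carried by the first annotation that starts with the prefix, if any. */
    static OptionalLong contentSize(List<String> annotations) {
        for (String annotation : annotations) {
            if (annotation.startsWith(ASSUMED_PREFIX)) {
                String value = annotation.substring(ASSUMED_PREFIX.length()).trim();
                try {
                    return OptionalLong.of(Long.parseLong(value));
                } catch (NumberFormatException e) {
                    // Prefix present but the remainder is not a number: treat as absent.
                    return OptionalLong.empty();
                }
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        List<String> annotations = Arrays.asList("someOtherAnnotation", "content-size:5120");
        System.out.println(contentSize(annotations));
    }
}
```

Returning an OptionalLong keeps "no annotation present" (for example, a failed fetch) distinct from a recorded size of zero.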
    • Constructor Detail

      • ContentSizeAnnotationPostProcessor

        public ContentSizeAnnotationPostProcessor()
        Constructor.
        See Also:
        Processor
    • Method Detail

      • innerProcess

        protected void innerProcess(org.archive.modules.CrawlURI crawlURI)
                             throws InterruptedException
        For each URI with a successful status code (status code > 0), adds an annotation with the content size.
        Specified by:
        innerProcess in class org.archive.modules.Processor
        Parameters:
        crawlURI - URI to add annotation for if successful.
        Throws:
        ArgumentNotValid - if crawlURI is null.
        InterruptedException - never thrown by this implementation.
        See Also:
        Processor
      • shouldProcess

        protected boolean shouldProcess(org.archive.modules.CrawlURI arg0)
        Specified by:
        shouldProcess in class org.archive.modules.Processor
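
The behaviour documented for innerProcess can be modelled without the crawler on the classpath. This is a sketch under stated assumptions, not the actual implementation: FetchedUri is a hypothetical stand-in for the few CrawlURI properties involved (fetch status, content size, annotations), the prefix value "content-size:" is assumed, and IllegalArgumentException stands in for ArgumentNotValid.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the documented rule: annotate only when status code > 0.
class ContentSizeAnnotationSketch {
    // ASSUMPTION: mirrors CONTENT_SIZE_ANNOTATION_PREFIX; the real value is not shown on this page.
    static final String ASSUMED_PREFIX = "content-size:";

    /** Hypothetical stand-in for the parts of CrawlURI that this sketch needs. */
    static final class FetchedUri {
        final int fetchStatus;
        final long contentSize;
        final List<String> annotations = new ArrayList<>();

        FetchedUri(int fetchStatus, long contentSize) {
            this.fetchStatus = fetchStatus;
            this.contentSize = contentSize;
        }
    }

    /** Adds a size annotation when the fetch status is positive; otherwise leaves the URI untouched. */
    static void annotate(FetchedUri uri) {
        if (uri == null) {
            // Stand-in for the documented ArgumentNotValid on a null crawlURI.
            throw new IllegalArgumentException("crawlURI must not be null");
        }
        if (uri.fetchStatus > 0) {
            uri.annotations.add(ASSUMED_PREFIX + uri.contentSize);
        }
    }

    public static void main(String[] args) {
        FetchedUri fetched = new FetchedUri(200, 1234L);
        annotate(fetched);
        System.out.println(fetched.annotations);
    }
}
```

Zero and negative status codes are skipped here because the documented condition is strictly "status code > 0".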