Class ToeThread

  • All Implemented Interfaces:
    java.lang.Runnable, org.archive.io.SinkHandlerLogThread, org.archive.modules.fetcher.HostResolver, org.archive.modules.ProcessorChain.ChainStatusReceiver, org.archive.util.ProgressStatisticsReporter, org.archive.util.Reporter

    public class ToeThread
    extends java.lang.Thread
    implements org.archive.util.Reporter, org.archive.util.ProgressStatisticsReporter, org.archive.modules.fetcher.HostResolver, org.archive.io.SinkHandlerLogThread, org.archive.modules.ProcessorChain.ChainStatusReceiver
    One "worker thread"; asks for CrawlURIs, processes them, repeats unless told otherwise.
    Author:
    Gordon Mohr
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  ToeThread.Step  
      • Nested classes/interfaces inherited from class java.lang.Thread

        java.lang.Thread.State, java.lang.Thread.UncaughtExceptionHandler
    • Field Summary

      • Fields inherited from class java.lang.Thread

        MAX_PRIORITY, MIN_PRIORITY, NORM_PRIORITY
    • Constructor Summary

      Constructors 
      Constructor Description
      ToeThread​(org.archive.crawler.framework.ToePool g, int sn)
      Create a ToeThread
    • Method Summary

      Modifier and Type Method Description
      void atProcessor​(org.archive.modules.Processor proc)  
      org.archive.crawler.framework.CrawlController getController()
      Get the CrawlController associated with this thread.
      java.lang.String getCurrentProcessorName()  
      int getSerialNumber()  
      java.lang.Object getStep()  
      boolean isActive()
      Is this thread validly processing a URI (that is, not paused, waiting for a URI, or interrupted)?
      protected void kill()
      Terminates a thread.
      void progressStatisticsLegend​(java.io.PrintWriter writer)  
      void progressStatisticsLine​(java.io.PrintWriter writer)  
      static void reportThread​(java.lang.Thread t, java.io.PrintWriter pw)  
      void reportTo​(java.io.PrintWriter pw)
      Compiles a report on this thread's status and writes it to the given PrintWriter.
      java.net.InetAddress resolve​(java.lang.String host)  
      void retire()
      Request that this thread retire (exit cleanly) at the earliest opportunity.
      void run()
      void setStep​(ToeThread.Step s, java.lang.String procName)  
      java.lang.String shortReportLegend()  
      java.lang.String shortReportLine()  
      void shortReportLineTo​(java.io.PrintWriter w)  
      java.util.Map<java.lang.String,​java.lang.Object> shortReportMap()  
      boolean shouldRetire()
      Whether this thread should cleanly retire at the earliest opportunity.
      • Methods inherited from class java.lang.Thread

        activeCount, checkAccess, clone, countStackFrames, currentThread, dumpStack, enumerate, getAllStackTraces, getContextClassLoader, getDefaultUncaughtExceptionHandler, getId, getName, getPriority, getStackTrace, getState, getThreadGroup, getUncaughtExceptionHandler, holdsLock, interrupt, interrupted, isAlive, isDaemon, isInterrupted, join, join, join, onSpinWait, resume, setContextClassLoader, setDaemon, setDefaultUncaughtExceptionHandler, setName, setPriority, setUncaughtExceptionHandler, sleep, sleep, start, stop, suspend, toString, yield
      • Methods inherited from class java.lang.Object

        equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
      • Methods inherited from interface org.archive.io.SinkHandlerLogThread

        getName
    • Constructor Detail

      • ToeThread

        public ToeThread​(org.archive.crawler.framework.ToePool g,
                         int sn)
        Create a ToeThread
        Parameters:
        g - the ToePool (thread group) this thread belongs to
        sn - serial number
    • Method Detail

      • run

        public void run()
        Specified by:
        run in interface java.lang.Runnable
        Overrides:
        run in class java.lang.Thread
        See Also:
        Thread.run()
      • atProcessor

        public void atProcessor​(org.archive.modules.Processor proc)
        Specified by:
        atProcessor in interface org.archive.modules.ProcessorChain.ChainStatusReceiver
      • getSerialNumber

        public int getSerialNumber()
        Specified by:
        getSerialNumber in interface org.archive.io.SinkHandlerLogThread
        Returns:
        the toe thread's serial number
      • getController

        public org.archive.crawler.framework.CrawlController getController()
        Get the CrawlController associated with this thread.
        Returns:
        the CrawlController
      • kill

        protected void kill()
        Terminates a thread.

        Calling this method makes the thread stop processing as soon as possible, although that may be never. It is meant to short-circuit hung threads.

        The current CrawlURI has its fetch status set accordingly and is immediately returned to the frontier.

        This does not guarantee that the thread ever stops running. Once kill() has been invoked, however, the thread no longer tries to communicate with other parts of the crawler, and it terminates as soon as it regains control.
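
The "stop talking to the rest of the crawler once killed" behavior described for kill() can be illustrated with a small standalone sketch. This is not the Heritrix implementation: KillableWorker, its results list, and awaitStarted() are hypothetical names, and the sketch leaves out setting the fetch status and returning the URI to the frontier.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for a worker thread; not the Heritrix ToeThread.
class KillableWorker extends Thread {
    // Stands in for "other parts of the crawler" the worker would report to.
    private final List<String> results;
    private final CountDownLatch started = new CountDownLatch(1);
    // volatile so that a kill() from another thread is visible in run()
    private volatile boolean killed = false;

    KillableWorker(List<String> results) {
        this.results = results;
    }

    /** Mark the worker as killed and try to wake it from a blocking call. */
    void kill() {
        killed = true;
        // interrupt() only helps for interruptible blocking calls;
        // a thread stuck in a busy loop may never notice.
        interrupt();
    }

    /** Blocks until run() has begun. */
    void awaitStarted() throws InterruptedException {
        started.await();
    }

    @Override
    public void run() {
        started.countDown();
        try {
            Thread.sleep(10_000); // placeholder for a processing step that hangs
        } catch (InterruptedException e) {
            // fall through to the killed check below
        }
        if (killed) {
            return; // killed: exit without reporting anything
        }
        results.add("finished");
    }
}
```

In this sketch a caller would start the worker, call kill(), and then join() with a timeout. The timeout matters because, as the text above says, termination is not guaranteed.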

      • getStep

        public java.lang.Object getStep()
        Returns:
        The current step: an abstract indication of where this thread is in its work, for debugging and reporting.
      • isActive

        public boolean isActive()
        Is this thread validly processing a URI (that is, not paused, waiting for a URI, or interrupted)?
        Returns:
        whether thread is actively processing a URI
      • retire

        public void retire()
        Request that this thread retire (exit cleanly) at the earliest opportunity.
      • shouldRetire

        public boolean shouldRetire()
        Whether this thread should cleanly retire at the earliest opportunity.
        Returns:
        True if should retire.
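
The retire()/shouldRetire() pair describes a cooperative shutdown: another thread sets a request, and the worker checks it between units of work. A minimal sketch of that pattern follows, assuming a plain queue of strings as the work source. RetirableWorker, processedCount(), and the 50 ms poll interval are illustrative assumptions, not part of the Heritrix API.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a worker thread; not the Heritrix ToeThread.
class RetirableWorker extends Thread {
    private final BlockingQueue<String> work;
    private final AtomicInteger processed = new AtomicInteger();
    // volatile so that a retire() from another thread is visible in run()
    private volatile boolean shouldRetire = false;

    RetirableWorker(BlockingQueue<String> work) {
        this.work = work;
    }

    /** Request a clean exit at the next loop boundary. */
    void retire() {
        shouldRetire = true;
    }

    boolean shouldRetire() {
        return shouldRetire;
    }

    int processedCount() {
        return processed.get();
    }

    @Override
    public void run() {
        try {
            while (!shouldRetire) {
                // Poll with a timeout so the flag is rechecked regularly.
                String item = work.poll(50, TimeUnit.MILLISECONDS);
                if (item != null) {
                    processed.incrementAndGet(); // placeholder for real processing
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep interrupt status, then exit
        }
    }
}
```

Unlike a forced stop, the item in progress always completes here, since the flag is only consulted between items.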
      • reportTo

        public void reportTo​(java.io.PrintWriter pw)
        Compiles a report on this thread's status and writes it to the given PrintWriter.
        Specified by:
        reportTo in interface org.archive.util.Reporter
        Parameters:
        pw - Where to print.
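
Because reportTo writes to a PrintWriter rather than returning a value, a caller that wants the report as a String has to supply an in-memory writer. The sketch below shows one way to do that. SimpleReporter is a hypothetical interface that mirrors only the reportTo(PrintWriter) signature, and ReportCapture is an illustrative helper; neither is part of Heritrix.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical interface mirroring only the reportTo(PrintWriter) signature.
interface SimpleReporter {
    void reportTo(PrintWriter pw);
}

class ReportCapture {
    /** Runs the reporter against an in-memory writer and returns the text. */
    static String capture(SimpleReporter reporter) {
        StringWriter buffer = new StringWriter();
        PrintWriter pw = new PrintWriter(buffer);
        reporter.reportTo(pw);
        pw.flush(); // make sure buffered output reaches the StringWriter
        return buffer.toString();
    }
}
```

With Heritrix on the classpath, the same StringWriter and PrintWriter pairing should work for any object that exposes reportTo(java.io.PrintWriter).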
      • reportThread

        public static void reportThread​(java.lang.Thread t,
                                        java.io.PrintWriter pw)
        Parameters:
        t - Thread
        pw - PrintWriter
      • shortReportMap

        public java.util.Map<java.lang.String,​java.lang.Object> shortReportMap()
        Specified by:
        shortReportMap in interface org.archive.util.Reporter
      • shortReportLineTo

        public void shortReportLineTo​(java.io.PrintWriter w)
        Specified by:
        shortReportLineTo in interface org.archive.util.Reporter
        Parameters:
        w - PrintWriter to write to.
      • shortReportLegend

        public java.lang.String shortReportLegend()
        Specified by:
        shortReportLegend in interface org.archive.util.Reporter
      • progressStatisticsLine

        public void progressStatisticsLine​(java.io.PrintWriter writer)
        Specified by:
        progressStatisticsLine in interface org.archive.util.ProgressStatisticsReporter
      • progressStatisticsLegend

        public void progressStatisticsLegend​(java.io.PrintWriter writer)
        Specified by:
        progressStatisticsLegend in interface org.archive.util.ProgressStatisticsReporter
      • getCurrentProcessorName

        public java.lang.String getCurrentProcessorName()
        Specified by:
        getCurrentProcessorName in interface org.archive.io.SinkHandlerLogThread
      • resolve

        public java.net.InetAddress resolve​(java.lang.String host)
        Specified by:
        resolve in interface org.archive.modules.fetcher.HostResolver