Package org.archive.crawler.framework
Class ToeThread
- java.lang.Object
- java.lang.Thread
- org.archive.crawler.framework.ToeThread
- All Implemented Interfaces:
Runnable, org.archive.io.SinkHandlerLogThread, org.archive.modules.fetcher.HostResolver, org.archive.modules.ProcessorChain.ChainStatusReceiver, org.archive.util.ProgressStatisticsReporter, org.archive.util.Reporter
public class ToeThread extends Thread implements org.archive.util.Reporter, org.archive.util.ProgressStatisticsReporter, org.archive.modules.fetcher.HostResolver, org.archive.io.SinkHandlerLogThread, org.archive.modules.ProcessorChain.ChainStatusReceiver
One "worker thread"; asks for CrawlURIs, processes them, repeats unless told otherwise.
- Author:
- Gordon Mohr
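The ask/process/repeat loop described above can be sketched roughly as follows. This is a minimal illustration in plain Java, not the Heritrix implementation: the class `WorkerLoopSketch`, the `BlockingQueue<String>` standing in for whatever hands out CrawlURIs, and the use of thread interruption as the "told otherwise" signal are all assumptions made for the example.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for a worker thread; NOT the real ToeThread.
class WorkerLoopSketch extends Thread {
    // Assumption: a queue of URI strings stands in for the source of CrawlURIs.
    private final BlockingQueue<String> uris;
    // Thread-safe record of what was processed, so callers can inspect it.
    private final List<String> processed = new CopyOnWriteArrayList<>();

    WorkerLoopSketch(BlockingQueue<String> uris, int serialNumber) {
        super("WorkerLoopSketch #" + serialNumber);
        this.uris = uris;
    }

    List<String> processed() {
        return processed;
    }

    @Override
    public void run() {
        try {
            // Ask for a URI, process it, repeat until interrupted.
            while (!Thread.currentThread().isInterrupted()) {
                String uri = uris.take();   // blocks until a URI is available
                processed.add(uri);         // "processing" is just recording here
            }
        } catch (InterruptedException e) {
            // Interrupted while waiting: treat it as the signal to stop.
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> uris = new LinkedBlockingQueue<>();
        uris.add("http://example.org/a");
        uris.add("http://example.org/b");

        WorkerLoopSketch worker = new WorkerLoopSketch(uris, 1);
        worker.start();
        while (worker.processed().size() < 2) {
            Thread.sleep(10);   // wait for both URIs to be handled
        }
        worker.interrupt();     // tell the worker to stop
        worker.join();
        System.out.println(worker.processed());
    }
}
```

The serial number is only used to name the thread here, loosely mirroring the `int sn` constructor argument; what the real constructor does with it is not stated on this page.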
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description):
static class ToeThread.Step
-
Nested classes/interfaces inherited from class java.lang.Thread
Thread.State, Thread.UncaughtExceptionHandler
-
-
Field Summary
-
Fields inherited from class java.lang.Thread
MAX_PRIORITY, MIN_PRIORITY, NORM_PRIORITY
-
-
Constructor Summary
Constructors (Constructor, Description):
ToeThread(org.archive.crawler.framework.ToePool g, int sn)
  Create a ToeThread.
-
Method Summary
Methods (Modifier and Type, Method, Description):
void atProcessor(org.archive.modules.Processor proc)
org.archive.crawler.framework.CrawlController getController()
  Get the CrawlController associated with this thread.
String getCurrentProcessorName()
int getSerialNumber()
Object getStep()
boolean isActive()
  Is this thread validly processing a URI, not paused, waiting for a URI, or interrupted?
protected void kill()
  Terminates a thread.
void progressStatisticsLegend(PrintWriter writer)
void progressStatisticsLine(PrintWriter writer)
static void reportThread(Thread t, PrintWriter pw)
void reportTo(PrintWriter pw)
  Compiles a report on this thread's status and writes it to pw.
InetAddress resolve(String host)
void retire()
  Request that this thread retire (exit cleanly) at the earliest opportunity.
void run()
  (non-Javadoc)
void setStep(ToeThread.Step s, String procName)
String shortReportLegend()
String shortReportLine()
void shortReportLineTo(PrintWriter w)
Map<String,Object> shortReportMap()
boolean shouldRetire()
  Whether this thread should cleanly retire at the earliest opportunity.
-
Methods inherited from class java.lang.Thread
activeCount, checkAccess, clone, countStackFrames, currentThread, dumpStack, enumerate, getAllStackTraces, getContextClassLoader, getDefaultUncaughtExceptionHandler, getId, getName, getPriority, getStackTrace, getState, getThreadGroup, getUncaughtExceptionHandler, holdsLock, interrupt, interrupted, isAlive, isDaemon, isInterrupted, join, join, join, onSpinWait, resume, setContextClassLoader, setDaemon, setDefaultUncaughtExceptionHandler, setName, setPriority, setUncaughtExceptionHandler, sleep, sleep, start, stop, suspend, toString, yield
-
Method Detail
-
run
public void run()
(non-Javadoc)
- Specified by:
run
in interface Runnable
- Overrides:
run
in class Thread
- See Also:
Thread.run()
-
setStep
public void setStep(ToeThread.Step s, String procName)
- Parameters:
s
-
-
atProcessor
public void atProcessor(org.archive.modules.Processor proc)
- Specified by:
atProcessor
in interface org.archive.modules.ProcessorChain.ChainStatusReceiver
-
getSerialNumber
public int getSerialNumber()
- Specified by:
getSerialNumber
in interface org.archive.io.SinkHandlerLogThread
- Returns:
- The toe thread serial number.
-
getController
public org.archive.crawler.framework.CrawlController getController()
Get the CrawlController associated with this thread.
- Returns:
- The CrawlController.
-
kill
protected void kill()
Terminates a thread.
Calling this method ensures that the current thread stops processing as soon as possible (note: this may be never). Meant to 'short circuit' hung threads.
The current crawl URI will have its fetch status set accordingly and will be immediately returned to the frontier.
As noted above, this does not ensure that the thread will ever stop running. Once invoked, however, the thread will not try to communicate with other parts of the crawler and will terminate as soon as control is established.
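One way to picture the "stop communicating, exit once control returns" behaviour is the sketch below. It is a guess at the general pattern, not the actual `kill()` implementation: `KillableWorkerSketch`, the `reported` list standing in for other parts of the crawler, the latch simulating a possibly hung step, and the call to `interrupt()` as a best-effort nudge are all assumptions introduced for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch of a "kill" flag; NOT the real ToeThread.kill().
class KillableWorkerSketch extends Thread {
    // Assumption: this list stands in for "other parts of the crawler".
    private final List<String> reported = new CopyOnWriteArrayList<>();
    // Assumption: the latch simulates a long-running step that may hang.
    private final CountDownLatch slowStep = new CountDownLatch(1);
    private volatile boolean killed = false;

    // Lets the simulated slow step complete normally.
    void finishSlowStep() {
        slowStep.countDown();
    }

    List<String> reported() {
        return reported;
    }

    // Mark the worker as killed and nudge it; the nudge is best effort only,
    // since a truly hung step might ignore interruption.
    void kill() {
        killed = true;
        interrupt();
    }

    @Override
    public void run() {
        try {
            slowStep.await();   // the step that might hang
        } catch (InterruptedException e) {
            // Woken early; fall through to the killed check below.
        }
        if (killed) {
            return;             // control regained after kill(): exit without reporting
        }
        reported.add("finished");
    }

    public static void main(String[] args) throws InterruptedException {
        KillableWorkerSketch worker = new KillableWorkerSketch();
        worker.start();
        worker.kill();
        worker.join();
        System.out.println("reported after kill: " + worker.reported());
    }
}
```

The flag is checked only after the slow step returns, which is why a step that never returns would keep the thread alive indefinitely, matching the "this may be never" caveat.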
-
getStep
public Object getStep()
- Returns:
- The current step: an abstract indication of where this thread is, for debugging and reporting.
-
isActive
public boolean isActive()
Is this thread validly processing a URI, not paused, waiting for a URI, or interrupted?
- Returns:
- whether thread is actively processing a URI
-
retire
public void retire()
Request that this thread retire (exit cleanly) at the earliest opportunity.
-
shouldRetire
public boolean shouldRetire()
Whether this thread should cleanly retire at the earliest opportunity.
- Returns:
- True if this thread should retire.
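The retire()/shouldRetire() pair reads like a cooperative shutdown request: one thread sets a flag, and the worker checks it between units of work. A minimal sketch of that pattern, assuming a volatile boolean flag and a hypothetical `RetiringWorkerSketch` class (neither is taken from the Heritrix source), might look like this:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of cooperative retirement; NOT the real ToeThread.
class RetiringWorkerSketch extends Thread {
    // Assumption: a volatile flag carries the retire request across threads.
    private volatile boolean shouldRetire = false;
    private final AtomicInteger completed = new AtomicInteger();

    // Request a clean exit; the worker notices at its next check.
    public void retire() {
        shouldRetire = true;
    }

    public boolean shouldRetire() {
        return shouldRetire;
    }

    int completed() {
        return completed.get();
    }

    @Override
    public void run() {
        // The flag is checked only between units, so a unit in progress finishes first.
        while (!shouldRetire()) {
            doOneUnitOfWork();
            completed.incrementAndGet();
        }
    }

    private void doOneUnitOfWork() {
        try {
            Thread.sleep(5);    // placeholder for real work
        } catch (InterruptedException e) {
            retire();           // in this sketch, an interrupt also counts as a retire request
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        RetiringWorkerSketch worker = new RetiringWorkerSketch();
        worker.start();
        Thread.sleep(50);
        worker.retire();
        worker.join();
        System.out.println("units completed before retiring: " + worker.completed());
    }
}
```

Unlike the kill() path, nothing here abandons work in progress: the worker leaves only at a point where it holds no unfinished unit.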
-
reportTo
public void reportTo(PrintWriter pw)
Compiles a report on this thread's status and writes it to the given PrintWriter.
- Specified by:
reportTo
in interface org.archive.util.Reporter
- Parameters:
pw
- Where to print.
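Because reportTo(PrintWriter) returns void and writes to the writer it is given, a caller that wants the report as a String can wrap a StringWriter. The sketch below shows that idiom using only java.io; `ReportSourceSketch` is a hypothetical stand-in with the same method shape, used so the example compiles without the Heritrix classes.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical stand-in with the same method shape as reportTo(PrintWriter).
interface ReportSourceSketch {
    void reportTo(PrintWriter pw);
}

class ReportCaptureSketch {
    // Run the report into an in-memory buffer and hand back the text.
    static String capture(ReportSourceSketch source) {
        StringWriter buffer = new StringWriter();
        PrintWriter pw = new PrintWriter(buffer);
        source.reportTo(pw);
        pw.flush();             // make sure everything reaches the buffer
        return buffer.toString();
    }

    public static void main(String[] args) {
        // The report text here is made up for the example.
        ReportSourceSketch worker = pw -> pw.print("worker #1: idle\n");
        System.out.print(capture(worker));
    }
}
```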
-
reportThread
public static void reportThread(Thread t, PrintWriter pw)
- Parameters:
t
- Thread
pw
- PrintWriter
-
shortReportMap
public Map<String,Object> shortReportMap()
- Specified by:
shortReportMap
in interface org.archive.util.Reporter
-
shortReportLineTo
public void shortReportLineTo(PrintWriter w)
- Specified by:
shortReportLineTo
in interface org.archive.util.Reporter
- Parameters:
w
- PrintWriter to write to.
-
shortReportLegend
public String shortReportLegend()
- Specified by:
shortReportLegend
in interface org.archive.util.Reporter
-
shortReportLine
public String shortReportLine()
-
progressStatisticsLine
public void progressStatisticsLine(PrintWriter writer)
- Specified by:
progressStatisticsLine
in interface org.archive.util.ProgressStatisticsReporter
-
progressStatisticsLegend
public void progressStatisticsLegend(PrintWriter writer)
- Specified by:
progressStatisticsLegend
in interface org.archive.util.ProgressStatisticsReporter
-
getCurrentProcessorName
public String getCurrentProcessorName()
- Specified by:
getCurrentProcessorName
in interface org.archive.io.SinkHandlerLogThread
-
resolve
public InetAddress resolve(String host)
- Specified by:
resolve
in interface org.archive.modules.fetcher.HostResolver
-
-