JobGenerator
implementations.
Pattern
).
ComponentLifeCycle
.
AggregationWorker
singleton contains the schedule and file
bookkeeping functionality needed in the aggregation of indexes.
AggregationWorker
inside
Jetty.
Constants.ARCDIRECTORY_NAME
.
DateFormat
object.
DateFormat
objects.
ArchiveFileNaming
.
ArcRepositoryClient.store(File)
.
LegacyHarvestReport
, but is intended to be used with a crawl order
that sets budget using "queue-total-budget" instead of the QuotaEnforcer
(@see HarvesterSettings.OBJECT_LIMIT_SET_BY_QUOTA_ENFORCER
).
BnfHeritrixController
.
CrawlProgressMessage
instance.
AbstractJobGenerator.canAccept(Job, DomainConfiguration)
.
HarvesterSettings.JOB_TIMEOUT_TIME
setting.
ComponentLifeCycle
better control over the component startup
and shutdown phases.
FileBatchJob.processOnlyFilesMatching(String)
construct.
HarvestChannel
object in the storage backend.
ShowUnusedConfigurations
boolean, which is flipped.
ShowUnusedSeedLists
boolean, which is flipped.
MetadataFileWriter
for ARC output.
MetadataFileWriter
for WARC output.
Settings.SETTINGS_FILE_PROPERTY
is not set.
Document
instance.
DomainConfiguration
s, performs the
following operations:
Edit the harvest template to add/remove deduplicator configuration.
HarvestMonitor
.
FrontierReportLine
.
HarvestMonitor
.
HarvestReport
implementation defined by the setting
HarvesterSettings.HARVEST_REPORT_CLASS
.
DateFormat
object for ARC date conversion.
File
getArchiveFile() -
Method in class dk.netarkivet.common.utils.archive.HeritrixArchiveHeaderWrapper
HarvestChannel
by its UID.
HarvestChannel
by its unique name.
HarvestChannel
mapped to the given HarvestDefinition
id.
Date
object.
HarvestChannel
for the given type of harvest.
AbstractJobGenerator.DOMAIN_CONFIG_SUBSET_SIZE
configurations that are scanned at each iteration.
HarvesterRegistrationRequest
s.
HarvesterRegistrationResponse
s.
Set
of header keys.
Map
of all header key/value pairs.
HarvestChannelDAO
singleton.
ArchiveFileNaming
implementation defined by the setting
settings.harvester.harvesting.heritrix.archiveNaming.class .
HeritrixLauncher
implementation defined by the setting
dk.netarkivet.harvester.harvesting.heritrixLauncher.class .
JobGenerator
implementation defined by the setting HarvesterSettings.JOBGEN_CLASS
.
DomainStats
object for that
domain, and if not found creates one with zero values.
#SNAPSHOT
singleton.
DateFormat
object for WARC date conversion.
HarvesterRegistrationResponse
s.
CrawlProgressMessage
s.
HarvestReport
interface to be used.
HarvestChannel
instances.
HarvestChannel
instances.
HarvesterRegistrationRequest
s that have been
received per channel, which makes it possible to tell whether any HarvestController
s are
registered to a given HarvestChannel
.
HarvestControllerServer
periodically sends
HarvesterReadyMessage
s to the JobDispatcher
to notify
it whether it is available for processing a job or already processing one.
HarvestController
at startup, to check if the channel name
it has been assigned is valid (e.g.
HarvesterStatusReceiver
after processing a
HarvesterRegistrationRequest
message.
HarvestJobManager
's lifecycle.
HarvestJobManager
application.
CrawlProgressMessage
s on the proper JMS channel, and
stores information to be presented in the monitoring console.
HarvestReport
.
HeritrixLauncher
.
BnfHeritrixController
instead
HarvestSchedulerMonitorServer
to the
HarvestMonitor
to notify it that a job ended and should not be
monitored anymore, and that any resource used to monitor this job
should be freed.
DefaultJobGenerator
or FixedDomainConfigurationCountJobGenerator
.
FixedDomainConfigurationCountJobGenerator
,
then this parameter toggles whether or not domain configurations with a budget of zero
(bytes or objects) should be excluded from jobs.
FixedDomainConfigurationCountJobGenerator
,
then this parameter represents the maximum number of domain configurations
in a partial harvest job.
FixedDomainConfigurationCountJobGenerator
,
then this parameter represents the maximum number of domain configurations
in a full harvest job.
JobGenerator
.
JobSupervisor.start()
for details.
LinkedHashMap
.
HarvestJobManager
.
IndexAggregator
.
HarvesterSettings.METADATA_GENERATE_ARCHIVE_FILES_REPORT
is set to true, sets the header of the
generated report file.
HarvesterSettings.METADATA_GENERATE_ARCHIVE_FILES_REPORT
is set to true, sets the name of the
generated report file.
ARCRecordToSearchResultAdapter
ArchiveFilesReportGenerator.ArchiveFileStatus
instance.
ScheduledThreadPoolExecutor
, which allows
periodically running one or several Runnable
tasks
(fixed rate execution).
ReplicaCacheDatabase
.
DomainnameQueueAssignmentPolicy
where the domain name returned is that of the candidateURI,
except where the domain name of the SeedURI is a different one.
HarvesterReadyMessage
to the JobDispatcher
.
HarvestChannel
name.
start()
method.
BitSet.size()
this does not return the
size in bytes used to represent this set.
ComponentLifeCycle
object.
StartedJobInfo
record to the persistent storage.
StartedJobInfo
record to the persistent storage.
HarvestChannel
object in the storage backend.
HarvesterDatabaseTables
to the required
version.
CommonSettings.REMOTE_FILE_CLASS
.
Constants.WARCDIRECTORY_NAME
.
DateFormat
object.