All Classes
Class |
Description |
AbstractHarvestReport |
Base implementation for a harvest report.
|
AbstractRemoteFile |
Abstract superclass for easy implementation of remote file.
|
AbstractRestHeritrixController |
Abstract base class for REST-based Heritrix controllers.
|
AbstractRestHeritrixController.LaunchResultHandler |
Implementation of a LaunchResultHandler for Heritrix3.
|
AcceptLanguageParser |
State-machine-based parser for the HTTP Accept-Language header.
|
AcceptLanguageParser.AcceptLanguage |
Parsed language, country, locale and qvalue.
|
AcceptLanguageParser.AcceptLanguageComparator |
|
AccessBitarchiveApplication |
This class is used to start the BitArchive application.
|
AccessBitarchiveServer |
Bitarchive container responsible for processing the different classes of message which can be received by a
bitarchive and returning appropriate data.
|
ActiveBitPreservation |
All bitpreservation implementations are assumed to have access to admin data and bitarchives.
|
ActiveBitPreservationFactory |
A factory for the ActiveBitPreservation interface.
|
Admin |
The interface for the administration of the storage.
|
AdminData |
Deprecated.
|
AdminDataMessage |
Class encapsulating a request to update AdminData.
|
AdminFactory |
Factory class for the admin instance.
|
AggregationWorker |
The AggregationWorker singleton contains the schedule and file bookkeeping functionality needed in the
aggregation of indexes.
|
AggregatorApplication |
|
AliasInfo |
Class encapsulating domain alias information.
|
AllDocsCollector |
Simple Collector to collect all results from Lucene query.
|
ApacheHttpClientReaderFactory |
|
Application |
The application entity in the deploy hierarchy.
|
ApplicationUtils |
This class provides functionality for starting applications.
|
ARCArchiveAccess |
The ARCArchiveAccess class implements reading of ARC indexes and files.
|
ARCBatchFilter |
A filter class for batch entries.
|
ARCBatchJob |
Abstract class defining a batch job to run on a set of ARC files.
|
ARCFilenameCDXRecordFilter |
A filter to use in CDXReader when finding CDXRecords matching a filename-pattern.
|
ArchiveBatchFilter |
A filter class for batch entries.
|
ArchiveBatchJob |
Abstract class defining a batch job to run on a set of ARC/WARC files.
|
ArchiveBatchJobBase |
|
ArchiveDateConverter |
Utility class for dispensing ARC/WARC DateFormat objects.
|
ArchiveDateConverter |
TODO merge with dk.netarkivet.common.utils.archive.ArchiveDateConverter
|
ArchiveDBConnection |
This class handles connections to the Archive database.
|
ArchiveExtractCDX |
Command line tool for extracting CDX information from given ARC/WARC files.
|
ArchiveExtractCDXJob |
Batch job that extracts information to create a CDX file.
|
ArchiveFile |
This class represents a file in the arcrepository which may be indexed by the indexer.
|
ArchiveFileDAO |
Data Access Object for ArchiveFile instances.
|
ArchiveFilenameParser |
|
ArchiveFileNaming |
Interface for a class that implements archiveFileNaming.
|
ArchiveFileNamingFactory |
|
ArchiveHeaderBase |
Utility class presenting the same record header API for both ARC and WARC record headers.
|
ArchiveMessage |
Common base class for messages exchanged between an archive server and an archive client (or within an archive).
|
ArchiveMessageHandler |
This default message handler shields off all unimplemented methods from the ArchiveMessageVisitor interface.
|
ArchiveMessageVisitor |
Interface for all classes which handle archive-related messages received from a JMS server.
|
ArchiveProfile |
Assemble the constants related to an archive format into profiles.
|
ArchiveRecordBase |
Base class for the unified ARC/WARC record API.
|
ArchiveSettings |
Settings specific to the archive module of NetarchiveSuite.
|
ArchiveStoreState |
This class contains a store state and the time when it was last set.
|
ARCKey |
Represents a location key in the ARC format.
|
ARCLookup |
This class allows lookup of URLs in the ArcRepository, using full Lucene indexes to find offsets.
|
ArcMerge |
Command line tool for merging several ARC files into a single ARC file.
|
ArcRepository |
The Arcrepository handles the communication with the different replicas.
|
ArcRepositoryApplication |
This class is used to start the ArcRepository application.
|
ArcRepositoryClient |
Generic interface defining all methods that an ArcRepository provides.
|
ArcRepositoryClientFactory |
A factory for ArcRepositoryClients.
|
ArcRepositoryEntry |
This class contains the information that we keep about each file in the arcrepository: Checksum and the store states
for all bitarchives.
|
ArcRepositoryServer |
Listens on the queue "TheArcrepos" and submits the messages to a corresponding visit method on BitarchiveClient.
|
ARCTestUtils |
|
ARCUtils |
Various utilities that do stuff that ARCWriter does not provide.
|
ArcWrap |
Command line tool for creating an ARC file from given data.
|
ArgumentNotValid |
Indicates that one or more arguments are invalid.
|
BasicTwoWaySSLProvider |
Class for loading certificates and keys from a key- and truststore
and configuring an Apache HTTP Registry to use these.
|
BatchEndedMessage |
An instance of this class is sent by a bitarchive machine (Bitarchive class) to a BitarchiveMonitorServer to indicate
that the machine has finished processing a batch job.
|
BatchExecuter |
Class for execution of a batchjob in a separate thread.
|
BatchFileType |
Enumeration of the different types of files the batchjob can be executed upon.
|
BatchGUI |
Utility class for creating the web page content for the batchjob pages.
|
BatchLocalFiles |
Class for running FileBatchJobs on a set of local files.
|
BatchMessage |
Container for batch jobs.
|
BatchReplyMessage |
Message class used by the bit archive monitor to notify the ArcRepository of a completed batch job.
|
BatchStatus |
Class for transferring batch status information.
|
BatchTermination |
Exception to tell running batchjobs to terminate.
|
BatchTerminationMessage |
Message for telling the bitarchives to terminate a specific batchjob.
|
BinSearch |
Performs a binary search through .cdx files for a given prefix string.
|
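As a rough illustration of the prefix lookup described for BinSearch, the sketch below runs a lower-bound binary search over an in-memory list of sorted lines and then collects the contiguous run of matches. The class name `PrefixBinarySearch`, the method `findWithPrefix`, and the use of an in-memory list instead of a .cdx file are assumptions made for this example; it is not the NetarchiveSuite implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: prefix lookup in sorted lines, as one would do on a sorted CDX index. */
final class PrefixBinarySearch {
    private PrefixBinarySearch() {
    }

    /**
     * Returns all lines that start with the given prefix.
     * Assumes sortedLines is sorted in natural String order.
     */
    static List<String> findWithPrefix(List<String> sortedLines, String prefix) {
        int lo = 0;
        int hi = sortedLines.size();
        // Lower bound: find the first index whose line is >= prefix.
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedLines.get(mid).compareTo(prefix) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        // In sorted order, all lines sharing the prefix are contiguous from the lower bound.
        List<String> matches = new ArrayList<>();
        for (int i = lo; i < sortedLines.size() && sortedLines.get(i).startsWith(prefix); i++) {
            matches.add(sortedLines.get(i));
        }
        return matches;
    }
}
```

A file-based variant would presumably seek to byte offsets and realign to line boundaries instead of indexing a list, but the search logic stays the same.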
Bitarchive |
The central class in the bit archive.
|
BitarchiveAdmin |
This class handles file lookup and encapsulates the actual placement of files.
|
BitarchiveApplication |
This class is used to start the BitArchive application.
|
BitarchiveARCFile |
The representation of an ARC file in the bit archive.
|
BitarchiveClient |
Proxy for remote bitarchive.
|
BitarchiveMonitor |
Class representing the monitor for bitarchives.
|
BitarchiveMonitorApplication |
This class is used to start the BitarchiveMonitor application.
|
BitarchiveMonitorServer |
Class representing message handling for the monitor for bitarchives.
|
BitarchiveRecord |
Class to hold the result of a lookup operation in the bitarchive: the metadata information associated with the record,
the actual byte content, and the name of the file the data were retrieved from. If the length of the record exceeds
the value of Settings.BITARCHIVE_LIMIT_FOR_RECORD_DATATRANSFER_IN_FILE, the record is stored in a RemoteFile.
|
BitarchiveServer |
Bitarchive container responsible for processing the different classes of message which can be received by a
bitarchive and returning appropriate data.
|
BitmagArcRepositoryClient |
Client side usage of an arc repository.
|
BitmagUtils |
Utility class to abstract away the specifics of setting up and obtaining bitrepository.org clients.
|
BitPreservationDAO |
This is an interface for communicating with bitpreservation databases.
|
BitPreservationSiteSection |
Site section that creates the menu for bit preservation.
|
BitpreservationUpdateThread |
Class for running a bitpreservation update in a separate thread.
|
BitpreservationUpdateType |
The available bitpreservation update types.
|
BitpreserveFileState |
Class encapsulating methods for handling web requests for ActiveBitPreservation.
|
BlockingCommandLauncher |
|
BnfHarvestReport |
|
BuildCompleteSettings |
Class for combining the different setting files into a complete settings file.
|
ByteClassLoader |
A subclass of ClassLoader that can take a byte[] containing a class file.
|
ByteJarLoader |
ByteJarLoader is a ClassLoader that stores java classes in a map where the key to the map is the class name, and the
value is the class stored as a byte array.
|
CachingProxyConnectionFactory |
Adds caching to another JMXProxyFactoryConnectionFactory.
|
CachingSLF4JAppender |
SLF4J appender that caches a certain number of log entries in a cyclic manner.
|
CachingSLF4JLogRecord |
Cached log entry and JMX bean in one.
|
CDXDataCache |
A RawDataCache that serves files with CDX data.
|
CDXIndexCache |
A cache that serves CDX index files for job IDs.
|
CDXIndexer |
Class for creating CDX indexes from archive files.
|
CDXMapper |
Hadoop Mapper for creating the CDX indexes.
|
CDXOriginCrawlLogIterator |
This subclass of CrawlLogIterator adds the layer of digging an origin of the form "arcfile,offset" out of a
corresponding CDX index.
|
CDXReader |
This class handles reading CDX files and finding entries in them.
|
CDXRecord |
Represents a line in a CDX file.
|
CDXRecordFilter |
Interface defining a filter to use in CDXReader when finding CDXRecords.
|
CDXUtils |
Utility class for creating CDX-files.
|
CGIRequestBuilder |
Abstraction layer above the Apache RequestBuilder to use for building requests to interact with cgi-services
|
ChannelID |
A class for representing the names of JMS queues.
|
Channels |
This singleton class is in charge of giving out the correct channels.
|
CheckDomainCrawltraps |
Checks DomainCrawltraps in the Domain table for validity.
|
ChecksumArchive |
This abstract class is the interface for the checksum archives, which can be one of the following:
- FileChecksumArchive where the archive is placed in a single file.
|
ChecksumArchiveFactory |
|
ChecksumArchiveServer |
Any subclass must be invoked through a method called 'getInstance'.
|
ChecksumCalculator |
Calculates MD5 or SHA1 checksums on files using the built-in Java methods.
|
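For orientation, a checksum of this kind can be computed with the JDK's `java.security.MessageDigest`. The sketch below is a minimal example and not the ChecksumCalculator source; the class name `ChecksumSketch` and the method `hexDigest` are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch: hex-encoded MD5 or SHA-1 digest of a stream using only the JDK. */
final class ChecksumSketch {
    private ChecksumSketch() {
    }

    /**
     * Reads the stream to its end and returns the lowercase hex digest.
     * The algorithm is a JDK algorithm name such as "MD5" or "SHA-1".
     */
    static String hexDigest(String algorithm, InputStream in) {
        try {
            MessageDigest digest = MessageDigest.getInstance(algorithm);
            byte[] buffer = new byte[8192];
            int read;
            // Feed the digest in chunks so large files need not fit in memory.
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown digest algorithm: " + algorithm, e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

To checksum a file, pass a stream such as the one returned by `java.nio.file.Files.newInputStream(path)` and close it afterwards.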
ChecksumClient |
Proxy for remote checksum archive.
|
ChecksumFileApplication |
This class is used to start the checksum file application.
|
ChecksumFileServer |
The server for the ChecksumFileApplication.
|
ChecksumJob |
Class responsible for checksumming a list of files.
|
ChecksumStatus |
The status of the checksum for the bitpreservation database.
|
CheckTrapsInFile |
Test all strings in the file argument for XML-wellformedness.
|
ClassAsserts |
Utility class containing various methods for making assertions on Class objects.
|
CleanupHook |
Defines a ShutdownHook for a class which has a cleanup method.
|
CleanupIF |
Interface for classes which can be cleaned up by a shutdown hook.
|
ClientAction |
Simple interface for grouping the various actions that the client can perform
|
CollectionAsserts |
Utilities for doing asserts on collections.
|
CollectionPrefixNamingConvention |
Implements another way of prefixing archive files in NetarchiveSuite.
|
CollectionUtils |
Miscellaneous utility functions for collections that Sun neglected to make.
|
CombiningMultiFileBasedCache<T extends Comparable<T>> |
This class provides the framework for classes that cache the effort of combining multiple files into one.
|
CommandLineParser |
Print DigestIndexer command-line usage message.
|
CommandResolver |
An abstract superclass for URIResolvers that handle commands given to the server host
(http://< >/<>/<>=<>*).
|
CommonSettings |
Settings common to the entire NetarchiveSuite.
|
ComponentLifeCycle |
Extends the default construction -> deconstruction object life cycle with additional steps, giving users of
ComponentLifeCycle better control over the component startup and shutdown phases.
|
ConfigResource |
|
ConfigTemplateBuilder |
|
Constants |
Constants used in bit preservation.
|
Constants |
Constants for the Archive module.
|
Constants |
Constants for the bitarchive webinterface.
|
Constants |
This class is used for global constants only.
|
Constants |
Contains the constants for the management classes.
|
Constants |
Class containing the constant variables.
|
Constants |
Constants for harvester module.
|
Constants |
Constants used by the datamodel and webinterface packages.
|
Constants |
Constants used by the frontier package.
|
Constants |
Constants for heritrix3-controller module.
|
Constants |
Harvester webinterface constants.
|
Constants |
Constants for the Monitor module.
|
Constants |
Constants for the Viewerproxy module.
|
Constants |
Viewerproxy webinterface constants.
|
ContentAttribute_Generic |
Generic EAV attribute.
|
ContentAttrType_Generic |
Generic EAV attribute type.
|
ContentSizeAnnotationPostProcessor |
A post processor that adds a content-size: annotation for each successfully harvested URI.
|
Controller |
The API for controlling the viewerproxy.
|
CookieUtils |
|
CookieUtils.Lifespan |
Some cookie lifespan to play with.
|
CorrectMessage |
The message to correct a bad entry in an archive.
|
CrawlDataItem |
A base class for individual items of crawl data that should be added to the index.
|
CrawlDataIterator |
An abstract base class for implementations of iterators that iterate over different sets of crawl data (i.e.
|
CrawlertrapsUtils |
Some utilities for validation of crawlertraps.
|
CrawlLogDataCache |
This class implements the low-level cache for crawl log Lucene indexing.
|
CrawlLogExtractionMapper |
Hadoop Mapper for extracting crawllog lines from metadata files.
|
CrawlLogExtractionStrategy |
Strategy to give a HadoopJob when wanting to extract crawl log lines matching some regex from metadata files.
|
CrawlLogIndexCache |
A cache that serves Lucene indices of crawl logs for given job IDs.
|
CrawlLogIterator |
An implementation of a CrawlDataIterator capable of iterating over a Heritrix-style crawl.log.
|
CrawlLogLinesMatchingRegexp |
Batchjob that extracts lines from a crawl log matching a regular expression. The batch job should be restricted to run
on metadata files for a specific job only, using the FileBatchJob.processOnlyFilesMatching(String) construct.
|
CrawlProgressMessage |
This class wraps information stored in the Heritrix MBeans, CrawlService and CrawlService.Job, and represents the
crawl progress.
|
CrawlProgressMessage.CrawlStatus |
The general status of a job in NAS.
|
CrawlStatusMessage |
Instances of this class are sent by a HarvestControllerServer to the THE_SCHED queue to indicate the progress of a
heritrix crawl.
|
CreateCDXMetadataFile |
This tool creates a CDX metadata file for a given job's jobID and harvestPrefix by running a batch job on the
bitarchive and processing the results to give a metadata file.
|
CreateIndex |
A tool to request indices from the index server on demand.
|
CreateTestInstance |
This class applies the test variables.
|
DailyFrequency |
This class implements a frequency of a number of days.
|
DAO |
Interface common to all DAOs.
|
DAOProviderFactory |
|
DatabaseAdmin |
The administrator class for the ArcRepository when dealing with a database instead of a file (alternative to
AdminData).
|
DatabaseBasedActiveBitPreservation |
The database based active bit preservation.
|
DatabaseChecksumArchive |
A ChecksumArchive persisted with a Berkeley DB JE Database.
|
DatabasePreservationState |
This class contains the preservation data based on the database data of a given filename.
|
DatabaseTestUtils |
Utilities to allow testing databases.
|
DatedFileListJob |
Job which returns the names of all files in the archive modified after a specific date.
|
DBSpecifics |
Abstract collection of DB methods that are not standard SQL.
|
DBSpecifics |
Defines database specific implementations used by the Harvester.
|
DBUtils |
Various database related utilities.
|
DedupAttributeConstants |
Lifted from H1 AdaptiveRevisitAttributeConstants and limited to what DeDuplicator was using.
|
DedupCrawlLogIndexCache |
A cache of crawl log indices appropriate for the Icelandic deduplicator code, excluding all text entries.
|
DedupIndexer |
|
DeduplicateToCDXAdapter |
Class containing methods for turning duplicate entries in a crawl log into lines in a CDX index file.
|
DeduplicateToCDXAdapterInterface |
Interface describing a class which can be used to convert duplicate records in a crawl log to wayback-compatible cdx
records.
|
DeduplicateToCDXApplication |
A simple command line application to generate cdx files from local crawl-log files.
|
DeduplicationCDXExtractionBatchJob |
This batch job takes deduplication records from a crawl log in a metadata arcfile and converts them to cdx
records for use in wayback.
|
DeDuplicator |
Heritrix compatible processor.
|
DeDuplicator.AnalysisMode |
|
DeDuplicator.FilterMode |
|
DeDuplicator.MatchingMethod |
|
DeDuplicator.OriginHandling |
|
DefaultFreeSpaceProvider |
Default Free Space Provider of the number of bytes free on the file system.
|
DefaultJobGenerator |
The legacy job generator implementation.
|
DefaultJobGenerator.CompareConfigsDesc |
Compare two configurations using the following order: 1) Harvest template 2) Byte limit 3) expected number of
objects a harvest of the configuration will produce.
|
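A three-key ordering like the one described for CompareConfigsDesc can be expressed as a chained `java.util.Comparator`. The sketch below is illustrative only: the `Config` class and its fields are hypothetical stand-ins, and treating all three keys as descending is an assumption drawn from the "Desc" suffix, not from the NetarchiveSuite source.

```java
import java.util.Comparator;

/** Hypothetical sketch of a three-key descending comparator for harvest configurations. */
final class ConfigOrderingSketch {
    private ConfigOrderingSketch() {
    }

    /** Hypothetical stand-in for a domain configuration; not the NetarchiveSuite class. */
    static final class Config {
        final String template;
        final long byteLimit;
        final long expectedObjects;

        Config(String template, long byteLimit, long expectedObjects) {
            this.template = template;
            this.byteLimit = byteLimit;
            this.expectedObjects = expectedObjects;
        }
    }

    /**
     * Orders by 1) template name, 2) byte limit, 3) expected number of objects.
     * The final reversed() makes every key descending (an assumption, see lead-in).
     */
    static final Comparator<Config> DESCENDING =
            Comparator.comparing((Config c) -> c.template)
                    .thenComparingLong(c -> c.byteLimit)
                    .thenComparingLong(c -> c.expectedObjects)
                    .reversed();
}
```

Sorting with such a comparator places configurations that share a template next to each other, which is presumably what makes it useful when grouping configurations into jobs.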
DefinitionsSiteSection |
Site section that creates the menu for data definitions.
|
DelegatingController |
Control of viewer proxy.
|
DeployApplication |
The application that is run to generate install and start/stop scripts for all physical locations, machines and
applications.
|
DeployConfiguration |
The structure for the deploy-config.
|
DerbyEmbeddedSpecifics |
A class that implements functionality specific to the embedded Derby system.
|
DerbyEmbeddedSpecifics |
A class that implements functionality specific to the embedded Derby system.
|
DerbyServerSpecifics |
Implementation of DB-specific functions for the server-based Derby.
|
DerbyServerSpecifics |
Implementation of DB-specific functions for the server-based Derby.
|
DerbySpecifics |
Derby-specific implementation of DB methods.
|
DerbySpecifics |
Derby-specific implementation of DB methods.
|
Diff |
Compares two collections, returning a list of the additions, changes, and deletions between them.
|
Difference |
Represents a difference, as used in Diff.
|
DigestIndexer |
A class for building a de-duplication index.
|
DigestIndexerWorker |
This worker class handles the indexing of one single crawl-log and associated cdxfile.
|
DigestOptions |
Encapsulates the options for the indexing process.
|
Domain |
Represents known information about a domain. A domain is identified by a domain name (e.g. kb.dk).
|
DomainConfiguration |
This class describes a configuration for harvesting a domain.
|
DomainConfigurationDefinition |
Utility class containing methods for processing a GUI-request to update the details of a domain-configuration.
|
DomainDAO |
Persistent storage for Domain objects.
|
DomainDBDAO |
A database-based implementation of the DomainDAO.
|
DomainDefinition |
Utility class for handling update of domain from the domain jsp page.
|
DomainHarvestInfo |
DomainConfigPair class for extracted information on harvests on a specific domain.
|
DomainHistory |
Container for the historical information available for a domain.
|
DomainIngester |
This class manages a thread that ingests (i.e.
|
DomainnameQueueAssignmentPolicy |
Using the domain as the queue-name.
|
DomainOwnerInfo |
This class manages owner information about a domain.
|
DomainSearchType |
Enumeration of the possible ways to search for domains.
|
DomainSeedsDefinition |
Contains utility methods for updating seedlists from the GUI.
|
DomainSeedsDefinition.UrlInfo |
Utility class gathering together data relating to the editing of a seed list.
|
DomainStats |
Tuple class to hold domain harvest statistics for a single domain.
|
DomainStatsReport |
Used together with the HarvestReportGenerator to generate a HarvestReport.
|
DomainUtils |
Utilities for working with domain names.
|
DoOneCrawlMessage |
Container for doOneCrawl request.
|
EAV |
EAV wrapper for the actual EAV implementation.
|
EAV.AttributeAndType |
Handy class to pair an attribute and its type.
|
EMailNotifications |
Delivers NetarchiveSuite notifications by email.
|
EMailUtils |
Utilities for sending an email.
|
EvaluateConfigFile |
Class for evaluating a config file.
|
EventHarvestUtil |
Contains utility methods for supporting event harvest GUI.
|
ExceptionUtils |
Utilities for reading a stacktrace.
|
ExhaustedQueuesFilter |
Filters a frontier report to include only lines that represent exhausted queues.
|
ExportFrontierReportCsvQuery |
UI query to export the frontier report extract as a CSV file.
|
ExportFrontierReportCsvQuery.UI_FIELD |
Defines the UI fields and their default values.
|
ExtendableEntity |
|
ExtendedDNSFetcher |
Processor to resolve 'dns:' URIs.
|
ExtendedField |
This class represents one Extended Field.
|
ExtendedFieldConstants |
Constants primarily used by class ExtendedFieldDefinition and the jsp pages with extendedField functionality.
|
ExtendedFieldDAO |
Interface for creating and accessing extended fields in persistent storage.
|
ExtendedFieldDataTypes |
Constants for the available ExtendedFieldDatatypes.
|
ExtendedFieldDBDAO |
A database-based implementation of ExtendedFieldDAO.
|
ExtendedFieldDefaultValue |
Class for constructing, validating, and keeping the default value for a single ExtendedField.
|
ExtendedFieldDefinition |
Contains utility methods for creating and editing schedule definitions for harvests.
|
ExtendedFieldOptions |
Class to represent options for Extended Fields.
|
ExtendedFieldType |
This class represents one Extended Field Type.
|
ExtendedFieldTypeDAO |
Interface for creating and accessing extended fields in persistent storage.
|
ExtendedFieldTypeDBDAO |
Implementation of the ExtendedFieldTypeDAO interface for creating and accessing extended fields in persistent
storage.
|
ExtendedFieldTypes |
Class declaring constants for ExtendedFieldTypes and their corresponding table-names.
|
ExtendedFieldValue |
Class for holding a value of one ExtendedField.
|
ExtendedFieldValueDAO |
Interface for creating and accessing extended fields in persistent storage.
|
ExtendedFieldValueDBDAO |
Implementation class for the ExtendedFieldValueDAO interface.
|
ExtendedFieldValueDefinition |
|
ExtendedFTPRemoteFile |
This class extends the functionality of FTPRemoteFile by allowing local input to be taken from an ArchiveRecord.
|
ExtractCDX |
Command line tool for extracting CDX information from given ARC files.
|
ExtractCDXJob |
Batch job that extracts information to create a CDX file.
|
ExtractorOAI |
This is a link extractor for use with Heritrix.
|
FailsOnJenkins |
|
FailsOnLocalAndJenkins |
|
FaultyHarvestControllerServer |
This class responds to JMS doOneCrawl messages from the HarvestScheduler and waits 10 minutes before failing the job.
|
FileArrayIterator<T> |
An iterator that iterates over elements that can be read from files, given an array of files.
|
FileAsserts |
Utility functions for asserting statements about files.
|
FileBasedActiveBitPreservation |
Deprecated.
|
FileBasedCache<T> |
A generic cache that stores items in files.
|
FilebasedFreeSpaceProvider |
File-based free space provider that reads the number of free bytes from a file.
|
FileBatchJob |
Interface defining a batch job to run on a set of files.
|
FileBatchJob.ExceptionOccurrence |
This class holds the information about exceptions that occurred in a batchjob.
|
FileChecksumArchive |
A checksum archive in the form of a file (as alternative to a database).
|
FileListJob |
A batch job which returns a list of all files in the bitarchive in which it runs.
|
FileListStatus |
The status for the file list updates.
|
FileNameHarvester |
|
FilePreservationState |
This class collects the available bit preservation information for a file.
|
FileRemoteFile |
A file represented as a RemoteFile.
|
FileRemover |
This class implements a batchjob that enables you to delete files from an archive.
|
FileResolver |
|
FileResolverRESTClient |
A FileResolver client to communicate with a service implementing the FileResolver API
e.g.
|
FileUtils |
Misc.
|
FilterIterator<T,S> |
An iterator that filters out and converts items from another iterator.
|
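One common way to build an iterator that both filters and converts is a look-ahead wrapper whose conversion hook returns null for items to drop. The sketch below follows that pattern; the class name `FilteringIterator` and the null-means-skip contract of `filter` are assumptions for this example and may differ from the actual FilterIterator API.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/**
 * Hypothetical sketch: wraps an Iterator<T> and yields converted items of type S,
 * skipping every source item for which filter() returns null.
 */
abstract class FilteringIterator<T, S> implements Iterator<S> {
    private final Iterator<T> source;
    /** Look-ahead element; null means nothing has been buffered yet. */
    private S next;

    protected FilteringIterator(Iterator<T> source) {
        this.source = source;
    }

    /** Converts an item, or returns null to filter it out. */
    protected abstract S filter(T item);

    @Override
    public boolean hasNext() {
        // Advance the source until one item survives the filter or the source runs out.
        while (next == null && source.hasNext()) {
            next = filter(source.next());
        }
        return next != null;
    }

    @Override
    public S next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        S result = next;
        next = null;
        return result;
    }
}
```

A consequence of this design is that null can never be a legitimate converted value, since it is reserved as the "skip" signal.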
FilterIteratorInterface |
Created by csr on 8/21/15.
|
FindRelevantCrawllogLines |
Finds relevant crawllog lines for a specific domain in a specific metadata file.
args: domain metadatafile
Note: currently the regexp is embedded in the jsp page harvester/qa-gui/src/main/webapp/QA-searchcrawllog.jsp
but should probably be moved to the Reporting class ./harvester/harvester-core/src/main/java/dk/netarkivet/viewerproxy/webinterface/Reporting.java
|
FindRunningJobQuery |
Represents a query for job IDs that would be set to harvest a given domain.
|
FindRunningJobQuery.UI_FIELD |
Defines the UI fields and their default values.
|
FixedDomainConfigurationCountJobGenerator |
Job generator implementation.
|
FixedUURI |
Fixed UURI which extends UURI to fix an NPE bug in getReferencedHost.
|
ForwardedToErrorPage |
This exception indicates that we have forwarded to a JSP error page and thus should stop all processing and just
return at the top level JSP.
|
FreeSpaceProvider |
This interface encapsulates providing the number of bytes free on the file system.
|
FreeSpaceProviderFactory |
Factory for FreeSpaceProvider.
|
Frequency |
This class defines various frequencies at which things can happen, such as midnight every day, 13:45 the first Monday
of a month, etc.
|
FrontierReport |
Common interface for a Heritrix frontier report wrapper.
|
FrontierReportAnalyzer |
Implements the analysis of a full frontier report obtained from Heritrix3 as the execution of a sequence of
user-defined filters, each of which generates a smaller, in-memory frontier report that is sent in a JMS message to the
HarvestMonitor.
|
FrontierReportCsvExport |
Utility class implementing the export of a frontier report object to a CSV file.
|
FrontierReportFilter |
Interface for a frontier report filter.
|
FrontierReportLine |
Wraps a line of the frontier report.
|
FrontierReportLineNaturalOrder |
|
FrontierReportMessage |
|
FTPConnectionManager |
|
FTPRemoteFile |
Class encapsulating upload to and download from an FTP server.
|
FTPValidator |
Tool for testing if an FTP server is NetarchiveSuite compliant.
|
FullCrawlLogIndexCache |
A CrawlLogIndexCache that takes in all entries in the crawl log.
|
FullFrontierReport |
Wraps a Heritrix 1 full frontier report.
|
FullHarvest |
This class contains the specific properties and operations of snapshot harvest definitions.
|
GenericDAO<T,PK extends Serializable> |
A generic class for managing storage and retrieval of persistent objects.
|
GenericHibernateDAO<T,PK extends Serializable> |
An implementation of Generic DAO which is specialised for hibernate object stores.
|
GenericMessageListener |
A bare bones MessageListener used for unit testing.
|
GetAllChecksumsMessage |
The GetAllChecksumsMessage is used to retrieve the checksums of all the files.
|
GetAllFilenamesMessage |
The GetAllFilenamesMessage is sent to retrieve all the filenames in a specific replica.
|
GetCDXRecordsBatchJob |
Job to get cdx records out of metadata files.
|
GetChecksumMessage |
The GetChecksumMessage is used to retrieve the checksum of a specific file.
|
GetDataResolver |
Wrapper for a URIResolver, which retrieves raw data on given specific URLs, and forwards all others to the wrapped
handler.
|
GetFile |
A command-line tool to get ARC files from the bitarchive.
|
GetFileAction |
Action class to get files from Bitmag.
|
GetFileEventHandler |
|
GetFileIDsAction |
Action class to get file IDs from Bitmag.
|
GetFileIDsEventHandler |
|
GetFileMessage |
Message requesting a file from a bitarchive.
|
GetMessage |
Container for get requests.
|
GetMetadataArchiveBatchJob |
A batch job that extracts metadata.
|
GetMetadataMapper |
Hadoop Mapper for extracting metadata entries from metadata files.
|
GetRecord |
A command-line tool to get ARC records from the bitarchive.
|
GlobalCrawlerTrapList |
Class representing one or more global crawler traps, modeled as a set of regular expressions.
|
GlobalCrawlerTrapListDAO |
A Data Access Object for managing persistent collections of global crawler traps.
|
GlobalCrawlerTrapListDBDAO |
A singleton giving access to global crawler traps.
|
GUIApplication |
This class is used to start the GUI web applications server.
|
GUIWebServer |
A class representing an HttpServer.
|
H1HeritrixTemplate |
Class encapsulating the Heritrix order.xml.
|
H3BudgetResource |
|
H3CrawlLogCachedResource |
|
H3FilterResource |
|
H3FrontierDeleteResource |
|
H3FrontierResource |
|
H3HeritrixTemplate |
Class encapsulating the Heritrix crawler-beans.cxml file
|
H3HeritrixTemplate.MetadataInfo |
|
H3JobResource |
|
H3ReportResource |
|
H3ScriptResource |
|
H3ScriptTemplateBuilder |
|
HadoopFileUtils |
Utilities for file actions related to Hadoop.
|
HadoopJob |
Wrapper for a Hadoop job to prepare/handle a job.
|
HadoopJobStrategy |
Interface for a HadoopJob's strategy of how to perform the job.
|
HadoopJobTool |
A simple generic Hadoop map-only tool that runs a given mapper on the passed input file
containing new-line separated file paths and outputs the job's resulting files in the passed output path
|
HadoopJobUtils |
Utilities for Hadoop jobs.
|
HadoopMiniClusterTestCase |
Overhead class for Hadoop tests that need to use a mini cluster.
|
HarvestChannel |
Harvest channels are used to dispatch harvest jobs to specific pools of crawlers.
|
HarvestChannelAction |
Abstract class representing web UI action for harvest channels.
|
HarvestChannelAction.ActionType |
|
HarvestChannelDAO |
Abstract class for the DAO handling the persistence of HarvestChannel instances.
|
HarvestChannelDBDAO |
Implementation class for the DAO handling the persistence of HarvestChannel instances.
|
HarvestChannelMappingServlet |
This class processes an Ajax call from the UI to generate a form for mapping a harvest to a channel.
|
HarvestChannelMappingServlet.Param |
Enumerates HTTP request parameter names.
|
HarvestChannelRegistry |
|
HarvestChannelSiteSection |
Site section that creates the menu for harvest channel and mappings.
|
HarvestControllerApplication |
This application controls the Heritrix3 harvester which does the actual harvesting, and is also responsible for
uploading the harvested data to the ArcRepository.
|
HarvestControllerApplication |
This application starts the FaultyHarvestControllerServer.
|
HarvestControllerServer |
This class responds to JMS doOneCrawl messages from the HarvestScheduler and launches a Heritrix crawl with the
received job description.
|
HarvestdatabaseUpdateApplication |
Utility for updating the harvestdatabase.
|
HarvestDatabaseValidator |
Simple tool to let you verify that you can access the database correctly with
settings defined by -Ddk.netarkivet.settings.file=/fullOrrelative/path/to/settings.xml
or the settings.
|
HarvestDBConnection |
This class handles connections to the harvest definition database, and also defines basic logic for checking versions
of tables.
|
HarvestDefinition |
This abstract class models the general properties of a harvest definition, i.e.
|
HarvestDefinitionDAO |
A Data Access Object for harvest definitions.
|
HarvestDefinitionDBDAO |
A database-oriented implementation of the HarvestDefinitionDAO.
|
HarvestDefinitionInfo |
Class containing Info about a harvestjob.
|
HarvestDocumentation |
This class contains code for documenting an H3 harvest.
|
HarvestedUrlsForDomainBatchJob |
Batchjob that extracts lines referring to a specific domain from a crawl log.
|
HarvesterArcRepositoryClient |
Implements the Facade pattern to shield off the methods in JMSArcRepositoryClient not to be used by the harvest
system.
|
HarvesterChannels |
This singleton class is in charge of giving out the correct channels.
|
HarvesterDatabaseTables |
Enum class defining the tables of the Harvester database and the required versions of the individual tables.
|
HarvesterMessage |
Common base class for messages exchanged between a harvester server and a harvester client.
|
HarvesterMessageHandler |
This default message handler shields off all unimplemented methods from the HarvesterMessageVisitor interface.
|
HarvesterMessageVisitor |
Interface for all classes which handle harvester-related messages received from a JMS server.
|
HarvesterQueueControl |
Use the JMSConnection.createQueueBrowser() method to test number of
entries in a specific Queue.
|
HarvesterReadyMessage |
The HarvestControllerServer periodically sends HarvesterReadyMessages to the JobDispatcher to
notify it whether it is available for processing a job or already processing one.
|
HarvesterRegistrationRequest |
Message sent by a HarvestController at startup, to check if the channel name it has been assigned is valid
(e.g.
|
HarvesterRegistrationResponse |
|
HarvesterSettings |
Settings specific to the harvester module of NetarchiveSuite.
|
HarvesterStatusReceiver |
Handles the reception of status messages from the harvesters.
|
HarvestHistoryTableHelper |
Used to manage the model used in the domain harvest history page.
|
HarvestInfo |
Summary information about a specific harvest of a domain.
|
HarvestingAbort |
This exception is used to signal that harvest is aborted.
|
HarvestJob |
|
HarvestJobGenerator |
Handles the generation of new jobs based on the harvest definitions in persistent storage.
|
HarvestJobManager |
Handles the dispatching of scheduled harvest to the harvest servers based on the harvests defined in the database.
|
HarvestJobManagerApplication |
|
HarvestMonitor |
Listens for CrawlProgressMessages on the proper JMS channel, and stores information to be presented in the
monitoring console.
|
HarvestReport |
Base interface for a post-crawl harvest report.
|
HarvestReportFactory |
Factory class for instantiating a specific implementation of HarvestReport.
|
HarvestReportGenerator |
Base implementation for a harvest report.
|
HarvestReportGenerator.ProgressStatisticsConstants |
Strings found in the progress-statistics.log, used to devise the default stop reason for domains.
|
HarvestRunInfo |
Information on a single run of a harvest.
|
HarvestSchedulerMonitorServer |
Registers submitted harvesting jobs by listening for CrawlStatusMessages on the
THE_SCHED queue, and processes completed harvests.
|
HarvestStatus |
This page provides support for the HarvestStatus pages of the web interface.
|
HarvestStatusJobDetails |
Created by jve on 5/31/16.
|
HarvestStatusQuery |
Represents a query for a set of jobs.
|
HarvestStatusQuery.SORT_ORDER |
Enum class defining the different sort-orders.
|
HarvestStatusQuery.UI_FIELD |
Defines the UI fields and their default values.
|
HarvestStatusRunningTablesSort |
Class used to manage the sort of tables in the harvest status running screen.
|
HarvestStatusRunningTablesSort.ColumnId |
List of the column IDs.
|
HarvestTemplateApplication |
Utility for maintaining harvest-templates from the commandline.
|
HeartBeatMessage |
Simple class representing a HeartBeat message from a bit archive application.
|
HeartBeatSender |
Thread responsible for sending out periodic HeartBeatMessages.
|
Heritrix1Constants |
|
Heritrix3Files |
This class encapsulates the information generated by Heritrix3 or delivered to Heritrix3 before a crawl.
|
Heritrix3JobMonitor |
|
Heritrix3JobMonitorThread |
|
Heritrix3Settings |
Settings specific to the heritrix3 harvester module of NetarchiveSuite.
|
Heritrix3WrapperManager |
|
HeritrixArchiveHeaderWrapper |
Heritrix wrapper implementation of the abstract archive header interface.
|
HeritrixArchiveRecordWrapper |
Heritrix wrapper implementation of the abstract archive record interface.
|
HeritrixController |
This implementation of the HeritrixController interface starts Heritrix3 as a separate process and uses JMX to
communicate with it.
|
HeritrixControllerFactory |
A factory class for HeritrixController instances.
|
HeritrixFiles |
This class encapsulates all the files that Heritrix gets from our system, and all files we read from Heritrix.
|
HeritrixFiles.Version |
|
HeritrixLauncher |
|
HeritrixLauncherAbstract |
A HeritrixLauncher object wraps around an instance of the web crawler Heritrix3.
|
HeritrixLauncherFactory |
|
HeritrixLaunchException |
This exception is used to signal that the launch of a Heritrix instance has gone wrong.
|
HeritrixTemplate |
Abstract class for manipulating Heritrix Templates.
|
HibernateUtil |
This class contains a single static utility method which returns a Hibernate session: HibernateUtil.getSession().
|
HistoryServlet |
|
HistoryServlet.Resource |
|
HistorySiteSection |
Site section that creates the menu for harvest history.
|
HostEntry |
Helper class to encapsulate information about one remote JmxConnection.
|
HostForwarding<T> |
Handles the forwarding of other hosts' MBeans matching a specific regular query and interface to a given mbean
server.
|
HostNameUtils |
Provide the hostname of the machine on which the program is running.
|
HourlyFrequency |
This class implements a frequency of a number of hours.
|
HTMLUtils |
This is a utility class containing methods for use in the GUI for netarkivet.
|
HTTPControllerClient |
Client side communication with http controller server.
|
HTTPControllerServer |
Wrapper for an URIResolver, which calls the controller methods on given specific URLs, and forwards all others to the
wrapped handler.
|
HttpLocaleHandler |
A class used to determine the appropriate locale/language to use for generating an HTTP response.
|
HttpLocaleHandler.Language |
Object constructed from language configuration in the settings xml file.
|
HTTPRemoteFile |
A remote file implemented with point-to-point HTTP communication.
|
HTTPRemoteFileRegistry |
This is a registry for HTTP remote files, meant for serving registered files to remote hosts.
|
HttpsClientBuilder |
Class for providing configured HTTPS clients to execute requests over SSL.
|
HTTPSRemoteFile |
A remote file implemented with point-to-point HTTPS communication.
|
HTTPSRemoteFileRegistry |
This is a registry for HTTPS remote files, meant for serving registered files to remote hosts.
|
I18n |
Internationalization class.
|
IcelandicExtractorJS |
Processes JavaScript files for strings that are likely to be
crawlable URIs.
|
IHeritrixController |
This interface encapsulates the direct access to Heritrix, allowing for accessing in various ways (direct class
access or JMX).
|
IllegalState |
An object was not in the right state for the operation attempted.
|
Index<I> |
An immutable pair of an index and the set this is an index for.
|
IndexAggregator |
Encapsulates the functionality for sorting and merging index files.
|
IndexClientFactory |
A factory for IndexClients.
|
Indexer |
|
IndexerQueue |
Singleton class which maintains the basic data structure and methods for the indexer.
|
IndexingState |
Stores the state of an indexing task.
|
IndexReadyMessage |
A message to send from the IndexServer to HarvestJobManager, that the index required by harvest with a given ID is
ready.
|
IndexRequestClient |
Client for index request server.
|
IndexRequestMessage |
Message for requesting an index from the index server, and for giving back the reply.
|
IndexRequestServer |
Index request server singleton.
|
IndexRequestServerFactory |
|
IndexRequestServerInterface |
An interface for all IndexRequestServer implementations.
|
IndexResource |
|
IndexResource.HarvestChannelStructure |
|
IndexServer |
Index server.
|
IndexServerApplication |
This class is used to start the IndexServer application.
|
IngestableFiles |
Encapsulation of files to be ingested into the archive.
|
IngestDomainList |
Utility class for ingesting new domains into the database.
|
InMemoryFrontierReport |
Implements a frontier report wrapper that is stored in memory.
|
InputStreamUtils |
This class provides various utilities for inputstreams.
|
IntegrityTester |
Integrity tests for the package dk.netarkivet.common.utils.
|
IOFailure |
An input/output operation failed.
|
IteratorUtils |
Various utilities to work with iterators more easily.
|
JMSArcRepositoryClient |
Client side usage of an arc repository.
|
JMSBroker |
Used to check if firewall ports are open and if the JMS broker is up and responding.
|
JMSConnection |
Handles the communication with a JMS broker.
|
JMSConnectionFactory |
Factory for JMS connection.
|
JMSConnectionMockupMQ |
A MockUp message queue, that generates a connection and destinations suitable for testing.
|
JMSConnectionMockupMQ.CallOnMessageThread |
|
JMSConnectionMockupMQ.TestConnection |
|
JMSConnectionMockupMQ.TestConnectionFactory |
|
JMSConnectionMockupMQ.TestDestination |
|
JMSConnectionMockupMQ.TestMessageConsumer |
|
JMSConnectionMockupMQ.TestMessageProducer |
|
JMSConnectionMockupMQ.TestObjectMessage |
|
JMSConnectionMockupMQ.TestQueue |
|
JMSConnectionMockupMQ.TestSession |
|
JMSConnectionMockupMQ.TestTopic |
|
JMSConnectionSunMQ |
Handles the communication with a Sun JMS broker.
|
JMSConnectionTester |
Tests JMSConnection, the class that handles all JMS operations for Netarkivet.
|
JMSConnectionTester.DummyServer |
|
JMSMonitorRegistryClient |
The monitor registry client sends messages with JMS to register the host for JMX monitoring.
|
JMXProxy |
This tool will simply reregister all MBeans that match the given query from the JMX hosts read in settings, using
its own platformmbeanserver.
|
JMXProxyConnection |
JMX interface for connection objects that can be used for accessing MBeans on remote servers.
|
JMXProxyConnectionFactory |
Common interface for objects that supply JMX connections to remote servers.
|
JMXStatusEntry |
Implementation of StatusEntry, that receives its data from the MBeanServer (JMX).
|
JMXSummaryUtils |
Various utility methods and classes for the JMX Monitor page.
|
JMXSummaryUtils.StarredRequest |
This class encapsulates a HttpServletRequest, making non-existing parameters appear as "*" for wildcard (or "0"
for the index parameter).
|
JMXUtils |
Various JMX-related utility functions.
|
Job |
This class represents one job to run by Heritrix.
|
JobDAO |
Interface for creating and accessing jobs in persistent storage.
|
JobDBDAO |
A database-based implementation of the JobDAO class.
|
JobDispatcher |
This class handles dispatching of Harvest jobs to the Harvesters.
|
JobEndedMessage |
This message is sent by the HarvestSchedulerMonitorServer to the HarvestMonitor to notify it that a
job ended and should not be monitored anymore, and that any resource used to monitor this job should be freed.
|
JobGenerator |
This interface defines the core methods that should be provided by a job generator.
|
JobGeneratorFactory |
Factory class for instantiating a specific implementation of JobGenerator.
|
JobIndexCache |
An interface to a cache of data for jobs.
|
JobInfo |
Interface for selecting partial job information necessary for constructing HeritrixFiles.
|
JobStatus |
Enumeration of the possible states (alt.: status) a Job can be in.
|
JobStatusInfo |
A simple tuple to deliver information on the status of jobs.
|
JobSupervisor |
|
JspWriterMockup |
JSP writer that simply writes to a public string writer.
|
KeyValuePair<K,V> |
A generic Map.Entry class, useful for returning key-value-like results.
|
LargeFileGZIPInputStream |
Subclass of GZIPInputStream, including a workaround to support >2GB files.
|
LegacyHarvestReport |
Class responsible for representing a domain harvest report from crawl logs created by Heritrix and presenting the
relevant information to clients.
|
LegacyNamingConvention |
Implements the standard way of prefixing archive files in Netarchivesuite.
|
LifeCycleComponent |
Takes care of the lifecycle of subcomponents (children).
|
LinuxMachine |
A LinuxMachine is the instance of the abstract machine class, which runs the operating system Linux or another
Unix-based operating system.
|
LinuxMachine.osInstallScriptTpl |
|
LinuxMachine.osKillScriptTpl |
|
LinuxMachine.osStartScriptTpl |
|
LoadableFileBatchJob |
This implementation of FileBatchJob is a bridge to a class file given as a File object.
|
LoadableJarBatchJob |
This implementation of FileBatchJob is a bridge to a jar file given as a File object.
|
LoadDatabaseChecksumArchive |
Program for uploading data from the filebased FileChecksumArchive to a DatabaseChecksumArchive.
|
LocalArcRepositoryClient |
A simple implementation of ArcRepositoryClient that just has a number of local directories where it stores its files.
|
LogbackRecorder |
This class implements a Logback appender which can be attached dynamically to an
SLF4J context.
|
LogbackRecorder.DenyFilter |
Simple deny filter.
|
LoggingOutputStream |
OutputStream which can be used to redirect all stdout and stderr to a logger.
|
LoggingOutputStream.LoggingLevel |
Enum representing the standard logging levels for commons logging.
|
LRUCache |
An LRU cache, based on LinkedHashMap.
|
LuceneUtils |
Some Lucene Utilities used in some of our tests.
|
Machine |
Machine defines an abstract representation of a physical computer at a physical location.
|
MailValidator |
Reads the mail-settings and tries to send a mail-notification.
|
MasterTemplateBuilder |
Class to handle the generation of HTML using a template engine.
|
MBeanConnectorCreator |
Utility class that handles exposing the platform mbean server using rmi, and using specified ports and password
files.
|
MessageAsserts |
Assertions on JMS/Netarkivet messages
|
MetadataCDXExtractionStrategy |
Strategy to extract CDX lines from metadata files.
|
MetadataCDXMapper |
Hadoop Mapper for creating CDX indexes for metadata files through the GUI application's QA pages.
|
MetadataEntry |
Class used to carry metadata in DoOneCrawl messages, including the URL and mimetype necessary to write the metadata
to metadata (W)ARC files.
|
MetadataExtractionStrategy |
Strategy to give a HadoopJob when wanting to extract selected content from metadata files matching specific
URL- and MIME-patterns.
|
MetadataFile |
Wraps information for a Heritrix file that should be stored in the metadata ARC.
|
MetadataFileWriter |
Abstract base class for Metadata file writer.
|
MetadataFileWriterArc |
MetadataFileWriter that writes to ARC files.
|
MetadataFileWriterWarc |
MetadataFileWriter that writes to WARC files.
|
MetadataIndexingApplication |
|
MinuteFrequency |
Allows specification of a schedule with a frequency measured in minutes.
|
MissingURIRecorder |
This class handles recordings of URIs not found during URI lookup.
|
MockFreeSpaceProvider |
Mock Free Space Provider of the number of bytes free on the file system.
|
MockupJMS |
|
MonitorMessage |
Common base class for messages exchanged between an archive server and an archive client (or within an archive).
|
MonitorMessageHandler |
This default message handler shields off all unimplemented methods from the MonitorMessageVisitor interface.
|
MonitorMessageVisitor |
Interface for all classes which handle monitor-related messages received from a JMS server.
|
MonitorRegistry |
A registry of known JMX URLs.
|
MonitorRegistryClient |
Client for registering JMX monitoring at registry.
|
MonitorRegistryClientFactory |
A factory for MonitorRegistryClient.
|
MonitorRegistryServer |
The monitor registry server listens on JMS for hosts that wish to register themselves to the service.
|
MonitorSettings |
Provides access to monitor settings.
|
MonthlyFrequency |
This class implements a frequency of a number of months.
|
MoveTestFiles |
|
MultiFileBasedCache<T extends Comparable<T>> |
Implementation of file based cache, that works with the assumption we are working on a set of ids, of which we might
only get a subset correct.
|
MySQLSpecifics |
|
MySQLSpecifics |
MySQL-specific implementation of DB methods.
|
Named |
This interface describes objects that have a name.
|
NamedUtils |
Utilities for handling named objects.
|
NasCrawlMetadata |
NetarchiveSuite extension of the org.archive.modules.CrawlMetadata class.
|
NASEnvironment |
|
NASEnvironment.StringMatcher |
|
NASFetchDNS |
Extended FetchDNS processor which allows the override of hosts
to be used before they are queried through a DNS server.
|
NASSurtPrefixedDecideRule |
Extended SurtPrefixedDecideRule class.
|
NASUser |
|
NasWARCProcessor |
Custom NAS WARCWriterProcessor adding NetarchiveSuite metadata to the WARCInfo records written
by Heritrix by just extending the org.archive.modules.writer.WARCWriterProcessor.
This was not possible in H1.
|
NetarchiveCacheResourceStore |
This is the connector between netarchivesuite and wayback.
|
NetarchiveResourceStore |
This is the connector between netarchivesuite and wayback.
|
NetarkivetException |
Base exception for all Netarkivet exceptions.
|
NetarkivetMessage |
Common base class for all messages exchanged in the NetarchiveSuite.
|
NodeTraverser |
Provides functionality for generating xml document nodes ad-hoc for test purposes.
|
NonFunctionalArcRepositoryServer |
|
Notifications |
This class encapsulates reacting to a serious error or warning message.
|
NotificationsFactory |
Get a notifications handler for serious errors.
|
NotificationType |
Encapsulates the three different notification Types.
|
NotifyingURIResolver |
A wrapper class for URI resolver, which also notifies an URIObserver about all URIs visited and their response codes.
|
NotImplementedException |
An exception to throw when an unfinished function is called.
|
NullRemoteFile |
This is an implementation of RemoteFile which does nothing and can therefore be used in batch jobs for which no
output is required.
|
NumberUtils |
Number related utilities.
|
OnbFreeSpaceProvider |
Onb Free Space Provider returns the number of bytes free,
returning 0 if a given path is not writable or if free space is lower than a given minimum free space percentage.
|
OnNSDomainsDecideRule |
Class that re-creates the SurtPrefixSet to include only domain names
according to the domain definition of NetarchiveSuite.
|
Pageable |
|
Pagination |
Builds a pagination HTML text block using twitter-bootstrap styles.
|
Parameters |
The Parameters class contains the machine parameters.
|
Parameters |
Definitions of parameters for the web interface.
|
PartialHarvest |
This class contains the specific properties and operations of harvest definitions which are not snapshot harvest
definitions.
|
Password |
Immutable password class.
|
PeriodicTaskExecutor |
|
PeriodicTaskExecutor.PeriodicTask |
Represents a periodic task.
|
PermissionDenied |
Access was denied to a resource or credentials were invalid.
|
PersistentJobData |
Class PersistentJobData holds information about an ongoing harvest.
|
PersistentJobData.XmlState |
Helper class for returning the OK-state back to the caller.
|
PersistentJobData.XmlState.OKSTATE |
enum for holding OK/NOTOK values.
|
PhysicalLocation |
The physical location class.
|
PostgreSQLSpecifics |
|
PostgreSQLSpecifics |
PostgreSQL-specific implementation of DB methods.
|
PostProcessing |
|
PrerequisiteIgnoringQuotaEnforcer |
A Heritrix QuotaEnforcer which never enforces quotas on prerequisite uris: dns, robots.txt, and credentials
|
PreservationArcRepositoryClient |
Implements the Facade pattern to shield off the methods in JMSArcRepositoryClient not to be used by the bit
preservation system.
|
PreservationState |
The interface for the preservation states used by the web applications.
|
PreserveStdStreams |
|
PreventSystemExit |
Configures the test environment to block calls to System.exit(), throwing a PermissionDenied instead.
|
PrintMonitorRegistryClient |
A trivial monitor registry client, that doesn't register anywhere, but simply reports where it might be monitored on
stdout.
|
PrintNotifications |
A notification implementation that prints notifications on System.err.
|
ProcessUtils |
Various utilities for running processes -- not exactly Java's forte.
|
PutFileAction |
Action class to put files to Bitmag.
|
PutFileEventHandler |
|
QASiteSection |
Site section that creates the menu for QA.
|
QueueController |
Helper class to test the status of the number of submitted jobs on our JMS Queues.
|
RawDataCache |
An interface for getting raw data out of the bitarchives based on job IDs.
|
RawMetadataCache |
This is an implementation of the RawDataCache specialized for data out of metadata files.
|
ReadOnlyAdminData |
Deprecated.
|
ReadOnlyByteArray |
Implements access to an array in a read-only fashion.
|
ReestablishAdminDatabase |
Method for reestablishing the admin database from an 'admin.data' file.
|
ReflectUtils |
Methods that help in doing common reflection tasks.
|
ReformatTranslationFile |
Program to reformat a translation file.
|
RegExpExclusionFilter |
|
RegExpExclusionFilterFactory |
This class allows one to specify a file containing a list of regular expressions specifying URLs to be blocked from
access via wayback.
|
RegisterHostMessage |
This type of message is sent to the monitor registry server to register the host for remote JMX monitoring.
|
ReloadSettings |
|
RememberNotifications |
Mockup that simply remembers its last calls in two public fields.
|
RemoteFile |
RemoteFile: Interface for encapsulating remote files.
|
RemoteFileFactory |
Factory for creating remote files.
|
RemoteFileSettings |
Container for the RemoteFile settings used by one app, so they can be used by another app.
|
RemoveAndGetFileMessage |
Message requesting a file to be removed and returned from a bitarchive.
|
RepeatingSchedule |
This class implements a schedule that should run a certain number of times.
|
Replica |
This class encapsulates the bitarchive or checksum replicas.
|
ReplicaCacheDatabase |
Method for storing the bitpreservation cache in a database.
|
ReplicaCacheHelpers |
|
ReplicaClient |
Interface for the replica clients.
|
ReplicaClientFactory |
This contains a method for retrieving all the replica clients at once.
|
ReplicaFileInfo |
This is a container for the ReplicaFileInfo table in the bitpreservation database.
|
ReplicaStoreState |
This class encapsulates the different upload states, while storing a file in the archive of a replica.
|
ReplicaType |
Enumeration of the possible replica types used for replicas.
|
Reporting |
Methods for generating the batch results needed by the QA pages.
|
Request |
The Request interface is a very minimal version of a HTTP request.
|
RequestType |
Types of requests we can handle in an index server.
|
RequiresFileResolver |
Marker interface for integration tests that are dependent on the existence of a FileResolver in the environment
|
ResetFailedFiles |
Utility to enable retry of indexing for selected files after they have reached maxFailedAttempts.
|
ResourceAbstract |
|
ResourceManagerAbstract |
|
Response |
The Response interface is a very minimal version of a HTTP response.
|
ResultSetIterator<T> |
Similar to a FilterIterator, but takes a java.sql.ResultSet (which is neither Iterable, Iterator nor Enumeration).
|
ResultStream |
Simple helper class that records whether a stream contains a header or not.
|
RetiredQueuesFilter |
|
RmiProxyConnectionFactory |
Creates RMI-based JMX connections to remote servers.
|
RunBatch |
A command-line tool to run batch jobs in the bitarchive.
|
RunningJobsInfoDAO |
Abstract class for handling the persistence of running job infos.
|
RunningJobsInfoDBDAO |
Class implementing the persistence of running job infos.
|
Schedule |
This class implements a schedule that can be either repeating or timed, depending on the subclass.
|
ScheduleDAO |
A DAO for reading and writing schedules by name.
|
ScheduleDBDAO |
A database-based implementation of the ScheduleDAO.
|
ScheduleDefinition |
Contains utility methods for creating and editing schedule definitions for harvests.
|
ScriptConstants |
This class contains constants and functions specific for creating the scripts and other files for the different
machines and applications.
|
SearchResult |
|
SeedList |
Representation of the list of harvesting seeds.
|
SeedUriDomainnameQueueAssignmentPolicy |
This is a modified version of the DomainnameQueueAssignmentPolicy
where the domainname returned is the domainname of the candidateURI
except where the SeedURI belongs to a different domain.
|
SelectiveHarvestUtil |
This class contains the methods for updating data for selective harvests.
|
SendDedupIndexRequestToIndexserver |
Program that sends an IndexRequestMessage to the indexserver,
so that it starts generating a deduplication index for a snapshot harvest.
Argument: File containing a set of numbers, one number per line.
|
Serial |
Created by IntelliJ IDEA.
|
SetSystemProperty |
This class allows setting a system property temporarily.
|
Settings |
Provides access to general application settings.
|
SettingsFactory<T> |
Generic class for creating class instances from class names given in settings.
|
ShutdownHook |
Defines a ShutdownHook for a class which has a cleanup method.
|
SimpleCDXRecordFilter |
A Simple CDXRecordFilter to be extended.
|
SimpleCmdlineTool |
A very abstracted interface for simple command line tools.
|
SimpleFileResolver |
A simple file resolver for resolving local files against a parent directory to get
Path objects representing these files
|
SimpleXml |
Utility class to load and save data from/to XML files using a very simple XML format.
|
SingleLogRecord |
An interface for reading the contents of a log record.
|
SingleMBeanObject<I> |
Wrapper class for registering objects of type I as MBeans.
|
SiteSection |
This class holds information about one section of the site, including information about what to put in the menu
sidebar and how to determine which page you're in.
|
SlowTest |
Marker interface to identify tests as taking a long time to run.
|
SnapshotHarvestDefinition |
Contains utility methods for supporting GUI for updating snapshot harvests.
|
SparseBitSet |
A sparse implementation of a BitSet, that does not require memory linear to the largest index.
|
SparseDomain |
Reduced version of the Domain class for presentation purposes.
|
SparseDomainConfiguration |
Sparse version for DomainConfiguration class.
|
SparseFullHarvest |
Sparse version of FullHarvest to be used for GUI purposes only.
|
SparsePartialHarvest |
Sparse version of PartialHarvest to be used for GUI purposes only.
|
StartedJobInfo |
This class is a simple bean storing information about a started job.
|
StartedJobInfo.Criteria |
List of the comparison criteria.
|
StatusEntry |
An interface that specifies the information available in our JMX log mbeans.
|
StatusSiteSection |
Site section that creates the menu for system status.
|
StopReason |
Class for containing a reason for stopping the harvesting of a domain.
|
StoreMessage |
Messages requesting store of file.
|
StreamUtils |
Utilities for handling streams.
|
StringAsserts |
More complex asserts for strings
|
StringIndexFile |
|
StringRemoteFile |
A RemoteFile implementation that just takes a string.
|
StringTree<T> |
An interface defining a structure with nodes, subnodes and leaves.
|
StringUtils |
Utilities for working with strings.
|
Synchronizer |
Converts an asynchronous call to a synchronous call.
|
SystemUtils |
Miscellaneous utilities for getting system resources.
|
TableSort |
Contains the data about how a table is sorted.
|
TableSort.SortOrder |
List of the sort orders.
|
Template |
Simple template engine functions that replace %{...} in an array of strings or a single string.
|
Template |
Simple template engine functions that replace ${...} in an array of strings or a single string.
|
TemplateDAO |
DAO methods for reading templates only.
|
TemplateDBDAO |
Implements the TemplateDAO with databases.
|
TestArcRepositoryClient |
A local-file based arc repository client.
|
TestChecksumJob |
|
TestConfigurationIF |
This interface should be implemented by classes that encapsulate one particular aspect to be handled by setUp() and
tearDown() in many unit tests.
|
TestDBConnection |
A wrapper around another SQL connection that changes the following:
|
TestFileUtils |
File utilities specific to the test classes.
|
TestIndexRequestServer |
Index request server singleton.
|
TestInfo |
Some constants used by HTTP(S)RemoteFileTester classes and UseTestRemoteFile.
|
TestInfo |
|
TestInfo |
|
TestMessageListener |
A simple message listener that collects the messages given to it and lets you query them
|
TestRemoteFile |
Version of RemoteFile that reads/writes a file to local storage.
|
TestResourceUtils |
|
TestSiteSection |
A site section for test use.
|
ThreadUtils |
|
TimedSchedule |
This class implements a schedule that runs over a specified period of time.
|
TimeUnit |
Enumeration of the possible time units used for frequencies in schedules.
|
TimeUtils |
Various utilities for waiting some time.
|
TLD |
Encapsulates the reading of top-level domains from settings and the embedded public_suffix.dat file.
|
TLDInfo |
A container for miscellaneous information about a TLD.
|
ToeThread |
One "worker thread"; asks for CrawlURIs, processes them,
repeats unless told otherwise.
|
ToeThread.Step |
|
ToolRunnerBase |
A simple class that manages and runs an implementation of SimpleCmdlineTool.
|
TopTotalEnqueuesFilter |
Filters the N active queues (i.e.
|
TrapAction |
Abstract class representing an action to take on the collection of global crawler traps.
|
TrapActionEnum |
Represents the various actions which can be carried out to modify Global Crawler Traps.
|
TrapActivationAction |
Action class for changing the activation status of a crawler trap list.
|
TrapCreateOrUpdateAction |
This action processes multipart uploads to either create or update a global crawler trap list.
|
TrapDeleteAction |
Action class for deleting a global crawler trap list.
|
TrapReadAction |
Class to read and return a global crawler trap list to a web request.
|
TrivialArcRepositoryClient |
A minimal implementation of ArcRepositoryClient that just has one local directory that it keeps its files in, no
checking no nothing.
|
TrivialJobIndexCache |
A trivial JobIndexCache implementation that just assumes somebody places the indexes in the right place (in
TrivialJobIndexCache under the cache dir).
|
UnknownCommandResolver |
Wrapper for an URIResolver, which gives failures on specific URLs, and forwards all others to the wrapped
handler.
|
UnknownID |
Identifier could not be resolved.
|
UpdateableAdminData |
Deprecated.
|
Upload |
A tool to force upload of given arc or warc files into the ArcRepository found in settings.xml.
|
UploadMessage |
Container for upload request.
|
URIObserver |
Super class for all URIObservers - calls the URIObserver notify method on all notifications of a URI and its response
code.
|
URIResolver |
Interface for all classes that may resolve requests and update response with result.
|
URIResolverHandler |
Interface for classes that use an URI resolver.
|
UrlCanonicalizerFactory |
A factory for returning a UrlCanonicalizer.
|
UseTestRemoteFile |
A preconfigure class for using TestRemoteFile instead of FTP
|
ViewerArcRepositoryClient |
Implements the Facade pattern to shield off the methods in JMSArcRepositoryClient not to be used by the bit
preservation system.
|
ViewerProxy |
Singleton of a viewerproxy.
|
ViewerProxyApplication |
This class is used to start the ViewerProxy application.
|
WARCBatchFilter |
A filter class for batch entries.
|
WARCBatchJob |
Abstract class defining a batch job to run on a set of WARC files.
|
WARCExtractCDX |
Command line tool for extracting CDX information from given WARC files.
|
WARCExtractCDXJob |
Batch job that extracts information to create a CDX file.
|
WarcRecordClient |
|
WARCUtils |
Various utilities on WARC-records.
|
WaybackCDXExtractionARCBatchJob |
Returns a cdx file using the appropriate format for wayback, including canonicalisation of urls.
|
WaybackCDXExtractionWARCBatchJob |
Returns a cdx file using the appropriate format for wayback, including canonicalisation of urls.
|
WaybackIndexer |
The WaybackIndexer starts threads to find new files to be indexed and indexes them.
|
WaybackIndexerApplication |
The entry point for the wayback indexer.
|
WaybackSettings |
Settings specific to the wayback module of NetarchiveSuite.
|
WebinterfaceTestCase |
A TestCase subclass specifically tailored to test webinterface classes, primarily the classes in
dk.netarkivet.harvester.webinterface: HarvestStatusTester, EventHarvestTester, DomainDefinitionTester,
ScheduleDefinitionTester, SnapshotHarvestDefinitionTester but also
dk.netarkivet.archive.webinterface.BitpreserveFileStatusTester
|
WebinterfaceTestCase.TestPageContext |
|
WebinterfaceTestCase.TestServletRequest |
A dummy class implementing only the methods for getting parameters.
|
WebProxy |
The WebProxy is the ONLY viewerproxy class that interfaces with the Jetty classes.
|
WebProxy.HttpRequest |
A wrapper around the Jetty HttpRequest, giving the simple Request interface used in our URIResolvers.
|
WebProxy.HttpResponse |
A wrapper around the Jetty HttpResponse, giving the simple Response interface used in our URIResolvers.
|
WeeklyFrequency |
This class implements a frequency of a number of weeks.
|
WindowsMachine |
A WindowsMachine is the instance of the abstract machine class, which runs the operating system Windows.
|
WindowsMachine.windowsStartVbsScriptTpl |
|
WorkFiles |
This class encapsulates access to the files used in bitpreservation.
|
WriteBytesToFile |
A class with a method for creating large files.
|
XmlAsserts |
Helper methods for asserts in Xml documents.
|
XmlBuilder |
|
XmlStructure |
The structure for handling the XML files.
|
XmlTree<T> |
A class that implements the StringTree interface by backing it with XML.
|
XmlUtils |
Utilities for handling XML-files.
|
ZipUtils |
Utilities for interfacing with the (fairly low-level) java.util.zip package.
|