Class Heritrix3Files
- java.lang.Object
-
- dk.netarkivet.harvester.heritrix3.Heritrix3Files
-
public class Heritrix3Files extends Object
This class encapsulates the information generated by Heritrix3 or delivered to Heritrix3 before a crawl.
-
-
Method Detail
-
getH3HeritrixFiles
public static Heritrix3Files getH3HeritrixFiles(File crawldir, PersistentJobData harvestInfo)
-
getH3HeritrixFiles
public static Heritrix3Files getH3HeritrixFiles(File crawldir, Job job)
-
getCrawlDir
public File getCrawlDir()
-
writeSeedsTxt
public void writeSeedsTxt(String seedListAsString)
-
getSeedsFile
public File getSeedsFile()
-
getOrderFile
public File getOrderFile()
-
setIndexDir
public void setIndexDir(File indexDir)
-
writeOrderXml
public void writeOrderXml(HeritrixTemplate orderXMLdoc)
-
getProgressStatisticsLog
public File getProgressStatisticsLog()
-
getJobID
public Long getJobID()
-
getOrderXmlFile
public File getOrderXmlFile()
-
getSeedsTxtFile
public File getSeedsTxtFile()
-
getHarvestID
public Long getHarvestID()
-
getArchiveFilePrefix
public String getArchiveFilePrefix()
-
getIndexDir
public File getIndexDir()
-
getCrawlLog
public File getCrawlLog()
-
getHeritrixZip
public File getHeritrixZip()
-
getCertificateFile
public File getCertificateFile()
-
getHeritrixOutput
public File getHeritrixOutput()
-
getHeritrixStderrLog
public File getHeritrixStderrLog()
-
getHeritrixStdoutLog
public File getHeritrixStdoutLog()
-
getHeritrixJobDir
public File getHeritrixJobDir()
-
getHeritrixBaseDir
public File getHeritrixBaseDir()
-
getJobname
public String getJobname()
-
deleteFinalLogs
public void deleteFinalLogs()
-
cleanUpAfterHarvest
public void cleanUpAfterHarvest(File oldJobsDir)
-
getDisposableFiles
public File[] getDisposableFiles()
The following files and directories are considered disposable: crawlDir/checkpoints, h3JobDir/state, h3JobDir/scratch, h3BaseDir/bin, h3BaseDir/extras, h3BaseDir/lib.
- Returns:
- an array of the disposable files and directories
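The disposable paths named in the getDisposableFiles description can be illustrated with a minimal, self-contained sketch. It does not reproduce the actual implementation of Heritrix3Files: the class DisposableFilesSketch, the helper disposableFiles, and the directory names passed in main are hypothetical, and the sketch only assumes that each listed entry is resolved relative to the crawl directory, the Heritrix3 job directory, or the Heritrix3 base directory as the description suggests.

```java
import java.io.File;

public class DisposableFilesSketch {

    // Hypothetical helper, NOT part of Heritrix3Files: builds the six paths
    // that the getDisposableFiles description lists, relative to the three
    // directories it names (crawlDir, h3JobDir, h3BaseDir).
    static File[] disposableFiles(File crawlDir, File h3JobDir, File h3BaseDir) {
        return new File[] {
            new File(crawlDir, "checkpoints"),
            new File(h3JobDir, "state"),
            new File(h3JobDir, "scratch"),
            new File(h3BaseDir, "bin"),
            new File(h3BaseDir, "extras"),
            new File(h3BaseDir, "lib")
        };
    }

    public static void main(String[] args) {
        // Placeholder directory names chosen for illustration only.
        File[] files = disposableFiles(
                new File("crawldir"), new File("h3job"), new File("h3base"));
        for (File f : files) {
            System.out.println(f.getPath());
        }
    }
}
```

With a real Heritrix3Files instance, the corresponding directories would presumably come from getCrawlDir(), getHeritrixJobDir() and getHeritrixBaseDir(); that mapping is an assumption based on the method names, not something the description states.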
-
-