public class CrawlLogIterator extends CrawlDataIterator
A CrawlDataIterator capable of iterating over a Heritrix-style crawl.log.

Modifier and Type | Field and Description |
---|---|
protected SimpleDateFormat | crawlDataItemFormat: The date format specified by the CrawlDataItem for dates entered into it (and eventually into the index). |
protected SimpleDateFormat | crawlDateFormat: The date format used in crawl.log files. |
protected BufferedReader | in: A reader for the crawl.log file being processed. |
protected CrawlDataItem | next: The next item to be issued (if ready), or null if the next item has not been prepared or there are no more elements. |

Constructor and Description |
---|
CrawlLogIterator(String source): Create a new CrawlLogIterator that reads items from a Heritrix crawl.log. |

Modifier and Type | Method and Description |
---|---|
void | close(): Closes the crawl.log file. |
String | getSourceType(): A short, human-readable string describing the source this iterator uses. |
boolean | hasNext(): Returns true if there are more items available. |
CrawlDataItem | next(): Returns the next valid item from the crawl log. |
protected CrawlDataItem | parseLine(String line): Parse a line in the crawl log. |
protected void | prepareNext(): Ready the next item. |

protected final SimpleDateFormat crawlDateFormat

protected final SimpleDateFormat crawlDataItemFormat
The date format specified by the CrawlDataItem for dates entered into it (and eventually into the index).

protected BufferedReader in

protected CrawlDataItem next

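The two SimpleDateFormat fields suggest that a timestamp read from the crawl.log is parsed with crawlDateFormat and re-rendered with crawlDataItemFormat. The following is a minimal sketch of that conversion, not the class's actual code. Both pattern strings are assumptions (this page does not state them), and the class name CrawlDateConversionSketch is hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class CrawlDateConversionSketch {
    // ASSUMPTION: an ISO 8601 style timestamp with milliseconds, as commonly seen in crawl.log files.
    static final String ASSUMED_CRAWL_LOG_PATTERN = "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'";
    // ASSUMPTION: a compact 17-digit timestamp for the item; the real pattern is not given on this page.
    static final String ASSUMED_ITEM_PATTERN = "yyyyMMddHHmmssSSS";

    /** Re-renders a crawl.log style timestamp in the assumed item format. */
    static String convert(String crawlLogTimestamp) {
        // SimpleDateFormat is not thread-safe, so the sketch creates fresh instances per call.
        SimpleDateFormat crawlDateFormat = new SimpleDateFormat(ASSUMED_CRAWL_LOG_PATTERN, Locale.ROOT);
        SimpleDateFormat crawlDataItemFormat = new SimpleDateFormat(ASSUMED_ITEM_PATTERN, Locale.ROOT);
        // Use the same time zone on both sides so the wall-clock fields are preserved.
        TimeZone utc = TimeZone.getTimeZone("UTC");
        crawlDateFormat.setTimeZone(utc);
        crawlDataItemFormat.setTimeZone(utc);
        crawlDateFormat.setLenient(false);
        try {
            Date parsed = crawlDateFormat.parse(crawlLogTimestamp);
            return crawlDataItemFormat.format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a crawl.log style timestamp: " + crawlLogTimestamp, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(convert("2016-03-01T12:34:56.789Z")); // prints 20160301123456789
    }
}
```

Because SimpleDateFormat instances are mutable and not thread-safe, a subclass that shares the protected format fields across threads would need its own synchronization.
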
public CrawlLogIterator(String source) throws IOException
Parameters:
source - The path of a Heritrix crawl.log file.
Throws:
IOException - If errors were found reading the log.

public boolean hasNext() throws IOException
hasNext in class CrawlDataIterator
Throws:
IOException - If an error occurs accessing the crawl data.

public CrawlDataItem next() throws IOException
next in class CrawlDataIterator
Throws:
IOException - If there is an error reading the item *after* the item to be returned from the crawl.log.
NoSuchElementException - If there are no more items.

protected void prepareNext() throws IOException
Note: This method should only be called when next==null.
Throws:
IOException

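Taken together, the next field and the hasNext(), next(), and prepareNext() methods describe a look-ahead iterator: an item is prepared before it is handed out, and lines that yield no usable item are skipped. The sketch below illustrates that pattern under stated assumptions and is not the library's implementation. The class LookAheadLineIterator, its readAll helper, and the use of String items in place of CrawlDataItem are all hypothetical, and the "skip blank lines" rule in parseLine is invented for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

public class LookAheadLineIterator {
    protected final BufferedReader in;
    // The prepared item, or null if nothing is prepared or the input is exhausted.
    protected String next;

    public LookAheadLineIterator(BufferedReader in) {
        this.in = in;
    }

    public boolean hasNext() throws IOException {
        if (next == null) {
            prepareNext(); // only called when next == null
        }
        return next != null;
    }

    public String next() throws IOException {
        if (!hasNext()) {
            throw new NoSuchElementException("No more items");
        }
        String result = next;
        next = null; // hand the item out exactly once
        return result;
    }

    /** Reads lines until one yields a usable item or the input ends. */
    protected void prepareNext() throws IOException {
        String line;
        while (next == null && (line = in.readLine()) != null) {
            next = parseLine(line);
        }
    }

    /** Hypothetical acceptance rule: blank lines are rejected by returning null. */
    protected String parseLine(String line) {
        String trimmed = line.trim();
        return trimmed.isEmpty() ? null : trimmed;
    }

    public void close() throws IOException {
        in.close();
    }

    /** Hypothetical helper that drains an iterator over in-memory text. */
    static List<String> readAll(String text) {
        LookAheadLineIterator it = new LookAheadLineIterator(new BufferedReader(new StringReader(text)));
        List<String> items = new ArrayList<>();
        try {
            while (it.hasNext()) {
                items.add(it.next());
            }
            it.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return items;
    }

    public static void main(String[] args) {
        System.out.println(readAll("first\n\n  second  \n")); // prints [first, second]
    }
}
```

This sketch prepares items lazily, inside hasNext(). Whether the documented class reads ahead at the same point is not stated on this page.
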
protected CrawlDataItem parseLine(String line)
Override this method to change how individual crawl log items are processed and accepted/rejected. This method is called from within the loop in prepareNext().
Parameters:
line - A line from the crawl log. Must not be null.
Returns:
A CrawlDataItem if the next line in the crawl log yielded a usable item, null otherwise.

public void close() throws IOException
close in class CrawlDataIterator
Throws:
IOException - If an error occurs closing access to crawl data.

public String getSourceType()
getSourceType in class CrawlDataIterator

Copyright © 2005–2016 The Royal Danish Library, the Danish State and University Library, the National Library of France and the Austrian National Library. All rights reserved.