public class CrawlDataItem extends Object
Modifier and Type | Field and Description |
---|---|
protected String | contentDigest |
static String | dateFormat - The proper formatting of setURL(String) and getURL() |
protected boolean | duplicate |
protected String | etag |
protected String | mimetype |
protected String | origin |
protected String | timestamp |
protected String | URL |
Constructor and Description |
---|
CrawlDataItem() - Constructor. |
CrawlDataItem(String URL, String contentDigest, String timestamp, String etag, String mimetype, String origin, boolean duplicate) - Constructor. |
Modifier and Type | Method and Description |
---|---|
String | getContentDigest() - Returns the document's content digest. |
String | getEtag() - Returns the etag that was associated with the document. |
String | getMimeType() - Returns the MIME type that was associated with the document. |
String | getOrigin() - Returns the "origin" that was associated with the document. |
String | getTimestamp() - Returns a timestamp for when the URL was fetched, in the format yyyyMMddHHmmssSSS. |
String | getURL() - Returns the URL. |
boolean | isDuplicate() - Returns whether the CrawlDataItem was marked as a duplicate. |
void | setContentDigest(String contentDigest) - Sets the content digest. |
void | setDuplicate(boolean duplicate) - Sets whether the item is a duplicate. |
void | setEtag(String etag) - Sets a new etag. |
void | setMimeType(String mimetype) - Sets a new MIME type. |
void | setOrigin(String origin) - Sets a new origin. |
void | setTimestamp(String timestamp) - Sets a new timestamp. |
void | setURL(String URL) - Sets the URL. |
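The constructors and accessors summarized above could be exercised roughly as follows. This is only a sketch: `CrawlDataItemSketch` is a hypothetical stand-in that mirrors the documented signatures so the example compiles on its own, and it is not the library's source. The URL, digest, timestamp, MIME type and origin values are made up for illustration.

```java
// Hypothetical stand-in mirroring the documented fields and a subset of the
// documented methods of CrawlDataItem. NOT the library's actual source.
class CrawlDataItemSketch {
    protected String URL;
    protected String contentDigest;
    protected String timestamp;
    protected String etag;
    protected String mimetype;
    protected String origin;
    protected boolean duplicate;

    CrawlDataItemSketch() {
    }

    CrawlDataItemSketch(String URL, String contentDigest, String timestamp,
                        String etag, String mimetype, String origin,
                        boolean duplicate) {
        this.URL = URL;
        this.contentDigest = contentDigest;
        this.timestamp = timestamp;
        this.etag = etag;
        this.mimetype = mimetype;
        this.origin = origin;
        this.duplicate = duplicate;
    }

    String getURL() { return URL; }
    String getContentDigest() { return contentDigest; }
    String getTimestamp() { return timestamp; }
    String getEtag() { return etag; }
    boolean isDuplicate() { return duplicate; }

    void setContentDigest(String contentDigest) { this.contentDigest = contentDigest; }
    void setDuplicate(boolean duplicate) { this.duplicate = duplicate; }
}

class CrawlDataItemUsage {
    // Builds an item with illustrative values, then updates it via setters.
    static CrawlDataItemSketch example() {
        CrawlDataItemSketch item = new CrawlDataItemSketch(
                "http://example.org/index.html", // URL (made up)
                "sha1:EXAMPLEDIGEST",            // content digest (placeholder)
                "20150314092653589",             // timestamp, yyyyMMddHHmmssSSS
                null,                            // no etag available
                "text/html",                     // MIME type
                "example-origin",                // origin: any String value
                false);                          // not yet marked as duplicate
        item.setDuplicate(true);
        return item;
    }

    public static void main(String[] args) {
        CrawlDataItemSketch item = example();
        System.out.println(item.getURL() + " duplicate=" + item.isDuplicate());
    }
}
```

With the real library on the classpath, the same calls would presumably be made against `CrawlDataItem` itself rather than the stand-in.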
public static final String dateFormat
The proper formatting of setURL(String) and getURL()
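The timestamp methods of this class document the format yyyyMMddHHmmssSSS. A minimal sketch of producing such a string with the standard `java.time` API is shown below. It assumes that pattern string; whether the `dateFormat` constant holds exactly this value is not stated on this page, and `TimestampSketch` is a hypothetical helper, not part of the library.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class TimestampSketch {
    // Pattern copied from the documented timestamp format. That dateFormat
    // holds the same string is an assumption.
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmssSSS");

    // Formats a date-time as a 17-digit timestamp string.
    static String format(LocalDateTime fetched) {
        return FORMAT.format(fetched);
    }

    public static void main(String[] args) {
        // 14 March 2015, 09:26:53.589
        LocalDateTime fetched =
                LocalDateTime.of(2015, 3, 14, 9, 26, 53, 589_000_000);
        System.out.println(format(fetched)); // prints 20150314092653589
    }
}
```

The resulting string is what would be passed to setTimestamp(String) or to the timestamp argument of the constructor.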
protected String contentDigest
protected boolean duplicate
public CrawlDataItem()
public CrawlDataItem(String URL, String contentDigest, String timestamp, String etag, String mimetype, String origin, boolean duplicate)
URL - The URL for this CrawlDataItem
contentDigest - A content digest of the document found at the URL
timestamp - Date of when the content digest was valid for that URL. Format: yyyyMMddHHmmssSSS
etag - Etag for the URL
mimetype - MIME type of the document found at the URL
origin - The origin of the CrawlDataItem (the exact meaning of the origin is outside the scope of this class and it may be any String value)
duplicate - True if this CrawlDataItem was marked as duplicate
public String getContentDigest()
public void setContentDigest(String contentDigest)
contentDigest - The new value of the content digest
public String getTimestamp()
public void setTimestamp(String timestamp)
timestamp - The new timestamp. It should be in the format: yyyyMMddHHmmssSSS
public String getEtag()
If the etag is unavailable, null is returned.
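Because getEtag() may return null, a caller comparing etags would need a guard. One possible shape is sketched below. `EtagSketch` and `sameEtag` are hypothetical names, and the strings stand in for values returned by getEtag(). Treating a missing etag as "no match" is a design assumption of this sketch, not documented behavior of the library.

```java
class EtagSketch {
    // Returns true only when both etags are present and equal.
    // A null on either side is treated as "cannot confirm a match".
    static boolean sameEtag(String previous, String current) {
        return previous != null && previous.equals(current);
    }

    public static void main(String[] args) {
        System.out.println(sameEtag("\"abc123\"", "\"abc123\"")); // prints true
        System.out.println(sameEtag(null, "\"abc123\""));          // prints false
        System.out.println(sameEtag(null, null));                  // prints false
    }
}
```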
public String getMimeType()
public void setMimeType(String mimetype)
mimetype - The new MIME type
public String getOrigin()
public boolean isDuplicate()
public void setDuplicate(boolean duplicate)
duplicate - true if duplicate, false otherwise

Copyright © 2005–2015 The Royal Danish Library, the Danish State and University Library, the National Library of France and the Austrian National Library. All rights reserved.