Package com.norconex.collector.http.doc
Class HttpDocInfo
- java.lang.Object
  - com.norconex.importer.doc.DocInfo
    - com.norconex.collector.core.doc.CrawlDocInfo
      - com.norconex.collector.http.doc.HttpDocInfo
- All Implemented Interfaces: Serializable
public class HttpDocInfo extends CrawlDocInfo
A URL being crawled holding relevant crawl information.
- Author: Pascal Essiembre
- See Also: Serialized Form
-
Nested Class Summary
- Nested classes/interfaces inherited from class com.norconex.collector.core.doc.CrawlDocInfo: CrawlDocInfo.Stage
-
Constructor Summary
- HttpDocInfo()
- HttpDocInfo(DocInfo docDetails): Copy constructor.
- HttpDocInfo(String reference)
- HttpDocInfo(String url, int depth): Constructor.
-
Method Summary
- void addRedirectURL(String url): Adds a redirect URL to the trail of URLs that were redirected so far.
- boolean equals(Object other)
- int getDepth(): Gets the URL depth.
- String getEtag(): Gets the HTTP ETag.
- String getOriginalReference()
- List<String> getRedirectTrail(): Gets the trail of URLs that were redirected up to this one.
- List<String> getReferencedUrls(): Gets URLs referenced by this one.
- String getReferrerLinkMetadata()
- String getReferrerReference()
- String getSitemapChangeFreq(): Gets the sitemap change frequency.
- ZonedDateTime getSitemapLastMod(): Gets the sitemap last modified date.
- Float getSitemapPriority(): Gets the sitemap priority.
- String getUrlRoot(): Gets the URL root (protocol + domain, e.g. http://www.host.com).
- int hashCode()
- void setDepth(int depth): Sets the URL depth.
- void setEtag(String etag): Sets the HTTP ETag.
- void setOriginalReference(String originalReference)
- void setRedirectTrail(List<String> redirectTrail): Sets the trail of URLs that were redirected up to this one.
- void setReference(String url)
- void setReferencedUrls(List<String> referencedUrls): Sets URLs referenced by this one.
- void setReferrerLinkMetadata(String referrerLinkMetadata)
- void setReferrerReference(String referrerReference)
- void setSitemapChangeFreq(String sitemapChangeFreq): Sets the sitemap change frequency.
- void setSitemapLastMod(ZonedDateTime sitemapLastMod): Sets the sitemap last modified date.
- void setSitemapPriority(Float sitemapPriority): Sets the sitemap priority.
- String toString()
-
Methods inherited from class com.norconex.collector.core.doc.CrawlDocInfo
getContentChecksum, getCrawlDate, getMetaChecksum, getParentRootReference, getState, setContentChecksum, setCrawlDate, setMetaChecksum, setParentRootReference, setState
-
Methods inherited from class com.norconex.importer.doc.DocInfo
addEmbeddedParentReference, copyFrom, copyTo, getContentEncoding, getContentType, getEmbeddedParentReferences, getReference, setContentEncoding, setContentType, setEmbeddedParentReferences
-
Constructor Detail
-
HttpDocInfo
public HttpDocInfo()
-
HttpDocInfo
public HttpDocInfo(String reference)
-
HttpDocInfo
public HttpDocInfo(String url, int depth)
Constructor.
- Parameters:
  url - URL being crawled
  depth - URL depth
-
HttpDocInfo
public HttpDocInfo(DocInfo docDetails)
Copy constructor.
- Parameters: docDetails - document details to copy
-
-
Method Detail
-
getEtag
public String getEtag()
Gets the HTTP ETag.
- Returns: etag
- Since: 3.0.0
-
setEtag
public void setEtag(String etag)
Sets the HTTP ETag.
- Parameters: etag - the ETag
- Since: 3.0.0
-
getOriginalReference
public String getOriginalReference()
-
setOriginalReference
public void setOriginalReference(String originalReference)
-
getDepth
public int getDepth()
Gets the URL depth.
- Returns: URL depth
-
getSitemapLastMod
public ZonedDateTime getSitemapLastMod()
Gets the sitemap last modified date.
- Returns: last modified date
-
setSitemapLastMod
public void setSitemapLastMod(ZonedDateTime sitemapLastMod)
Sets the sitemap last modified date.
- Parameters: sitemapLastMod - last modified date
-
getSitemapChangeFreq
public String getSitemapChangeFreq()
Gets the sitemap change frequency.
- Returns: sitemap change frequency
-
setSitemapChangeFreq
public void setSitemapChangeFreq(String sitemapChangeFreq)
Sets the sitemap change frequency.
- Parameters: sitemapChangeFreq - sitemap change frequency
-
getSitemapPriority
public Float getSitemapPriority()
Gets the sitemap priority.
- Returns: sitemap priority
-
setSitemapPriority
public void setSitemapPriority(Float sitemapPriority)
Sets the sitemap priority.
- Parameters: sitemapPriority - sitemap priority
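The sitemap setters above take a ZonedDateTime, a String, and a Float. The following is a minimal sketch, using only the JDK, of how raw sitemap text values might be converted to those types before being passed to the setters. The class name SitemapValuesSketch and its two helper methods are hypothetical and not part of the library. Treating date-only values as midnight UTC is an assumption, and the 0.0 to 1.0 range check reflects the sitemap protocol rather than anything this class is documented to enforce.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeParseException;

// Hypothetical helper, not part of the Norconex API.
public class SitemapValuesSketch {

    /** Parses a sitemap lastmod value; returns null when it cannot be parsed. */
    public static ZonedDateTime parseLastMod(String value) {
        if (value == null) {
            return null;
        }
        try {
            // Full date-time with offset, e.g. 2021-03-04T10:15:30+01:00
            return ZonedDateTime.parse(value);
        } catch (DateTimeParseException e) {
            // Not a full date-time; try the date-only form next.
        }
        try {
            // Date only, e.g. 2021-03-04 (assumption: interpret as midnight UTC)
            return LocalDate.parse(value).atStartOfDay(ZoneOffset.UTC);
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    /** Parses a sitemap priority; returns null when invalid or outside 0.0 to 1.0. */
    public static Float parsePriority(String value) {
        if (value == null) {
            return null;
        }
        try {
            Float priority = Float.valueOf(value.trim());
            return (priority >= 0f && priority <= 1f) ? priority : null;
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseLastMod("2021-03-04T10:15:30+01:00"));
        System.out.println(parseLastMod("2021-03-04"));
        System.out.println(parsePriority("0.8"));
    }
}
```

The returned values match the parameter types of setSitemapLastMod(ZonedDateTime) and setSitemapPriority(Float) documented above. Returning null for unusable input is a design choice of this sketch.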
-
setDepth
public final void setDepth(int depth)
Sets the URL depth.
- Parameters: depth - URL depth
-
getReferrerReference
public String getReferrerReference()
-
setReferrerReference
public void setReferrerReference(String referrerReference)
-
getReferrerLinkMetadata
public String getReferrerLinkMetadata()
-
setReferrerLinkMetadata
public void setReferrerLinkMetadata(String referrerLinkMetadata)
-
setReference
public final void setReference(String url)
- Overrides: setReference in class DocInfo
-
getUrlRoot
public String getUrlRoot()
Gets the URL root (protocol + domain, e.g. http://www.host.com).
- Returns: URL root
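To illustrate what "protocol + domain" means here, the following is a minimal sketch that derives a comparable root from a URL string with java.net.URI. It is not the library's implementation. The class name UrlRootSketch and the method urlRoot are hypothetical, and whether the real getUrlRoot() keeps a port or user info in the result is not stated in this page.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical illustration, not the library's implementation of getUrlRoot().
public class UrlRootSketch {

    /**
     * Returns scheme + "://" + authority for an absolute URL, or null when
     * the input is null, malformed, or lacks a scheme or authority.
     * Note: the authority may include a port or user info if present.
     */
    public static String urlRoot(String url) {
        if (url == null) {
            return null;
        }
        try {
            URI uri = new URI(url);
            if (uri.getScheme() == null || uri.getRawAuthority() == null) {
                return null;
            }
            return uri.getScheme() + "://" + uri.getRawAuthority();
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Path and query are dropped; only scheme and authority remain.
        System.out.println(urlRoot("http://www.host.com/some/page.html?q=1"));
    }
}
```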
-
getReferencedUrls
public List<String> getReferencedUrls()
Gets URLs referenced by this one.
- Returns: URLs referenced by this one (never null).
- Since: 2.6.0
-
setReferencedUrls
public void setReferencedUrls(List<String> referencedUrls)
Sets URLs referenced by this one.
- Parameters: referencedUrls - referenced URLs
- Since: 3.0.0
-
getRedirectTrail
public List<String> getRedirectTrail()
Gets the trail of URLs that were redirected up to this one.
- Returns: URL redirection trail to this one (never null).
- Since: 2.8.0
-
setRedirectTrail
public void setRedirectTrail(List<String> redirectTrail)
Sets the trail of URLs that were redirected up to this one.
- Parameters: redirectTrail - URL redirection trail to this one
- Since: 3.0.0
-
addRedirectURL
public void addRedirectURL(String url)
Adds a redirect URL to the trail of URLs that were redirected so far.
- Parameters: url - URL to add
- Since: 3.0.0
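The three redirect-trail methods above describe an ordered list that is never null when read. The following is a minimal sketch of that contract, written as a hypothetical stand-in class (RedirectTrailSketch) that only mirrors the documented method names. It is not the library's source. Returning an unmodifiable view and treating a null argument as an empty trail are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in that mirrors the documented method names only.
public class RedirectTrailSketch {

    // Always initialized, so the getter never returns null.
    private final List<String> redirectTrail = new ArrayList<>();

    /** Returns the trail in the order redirects were recorded (never null). */
    public List<String> getRedirectTrail() {
        // Assumption: expose a read-only view of the internal list.
        return Collections.unmodifiableList(redirectTrail);
    }

    /** Replaces the trail; a null argument clears it (assumption). */
    public void setRedirectTrail(List<String> trail) {
        redirectTrail.clear();
        if (trail != null) {
            redirectTrail.addAll(trail);
        }
    }

    /** Appends one redirected URL to the end of the trail. */
    public void addRedirectURL(String url) {
        redirectTrail.add(url);
    }

    public static void main(String[] args) {
        RedirectTrailSketch info = new RedirectTrailSketch();
        info.addRedirectURL("http://example.com/old");
        info.addRedirectURL("http://example.com/moved");
        System.out.println(info.getRedirectTrail());
    }
}
```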
-
equals
public boolean equals(Object other)
- Overrides: equals in class CrawlDocInfo
-
hashCode
public int hashCode()
- Overrides: hashCode in class CrawlDocInfo
-
toString
public String toString()
- Overrides: toString in class CrawlDocInfo