Package com.norconex.collector.http.doc
Class HttpDocInfo
- java.lang.Object
  - com.norconex.importer.doc.DocInfo
    - com.norconex.collector.core.doc.CrawlDocInfo
      - com.norconex.collector.http.doc.HttpDocInfo
- All Implemented Interfaces:
Serializable
public class HttpDocInfo extends CrawlDocInfo
A URL being crawled holding relevant crawl information.
- Author:
- Pascal Essiembre
- See Also:
- Serialized Form
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class com.norconex.collector.core.doc.CrawlDocInfo
CrawlDocInfo.Stage
-
-
Constructor Summary
Constructor | Description
HttpDocInfo() |
HttpDocInfo(DocInfo docDetails) | Copy constructor.
HttpDocInfo(String reference) |
HttpDocInfo(String url, int depth) | Constructor.
-
Method Summary
Modifier and Type | Method | Description
void | addRedirectToTrail(String url) | Adds a redirect URL to the trail of URLs that were redirected so far.
boolean | equals(Object other) |
int | getDepth() | Gets the URL depth.
String | getEtag() | Gets the HTTP ETag.
String | getOriginalReference() |
String | getRedirectTarget() | Gets the immediate target of a redirect.
List<String> | getRedirectTrail() | Gets the trail of URLs that were redirected up to this one.
List<String> | getReferencedUrls() | Gets URLs referenced by this one.
String | getReferrerLinkMetadata() |
String | getReferrerReference() |
String | getSitemapChangeFreq() | Gets the sitemap change frequency.
ZonedDateTime | getSitemapLastMod() | Gets the sitemap last modified date.
Float | getSitemapPriority() | Gets the sitemap priority.
String | getUrlRoot() | Gets the URL root (protocol + domain, e.g. http://www.host.com).
int | hashCode() |
void | setDepth(int depth) | Sets the URL depth.
void | setEtag(String etag) | Sets the HTTP ETag.
void | setOriginalReference(String originalReference) |
void | setRedirectTarget(String redirectTarget) | Sets the immediate target of a redirect.
void | setRedirectTrail(List<String> redirectTrail) | Sets the trail of URLs that were redirected up to this one.
void | setReference(String url) |
void | setReferencedUrls(List<String> referencedUrls) | Sets URLs referenced by this one.
void | setReferrerLinkMetadata(String referrerLinkMetadata) |
void | setReferrerReference(String referrerReference) |
void | setSitemapChangeFreq(String sitemapChangeFreq) | Sets the sitemap change frequency.
void | setSitemapLastMod(ZonedDateTime sitemapLastMod) | Sets the sitemap last modified date.
void | setSitemapPriority(Float sitemapPriority) | Sets the sitemap priority.
String | toString() |
-
Methods inherited from class com.norconex.collector.core.doc.CrawlDocInfo
getContentChecksum, getCrawlDate, getMetaChecksum, getParentRootReference, getState, setContentChecksum, setCrawlDate, setMetaChecksum, setParentRootReference, setState
-
Methods inherited from class com.norconex.importer.doc.DocInfo
addEmbeddedParentReference, copyFrom, copyTo, getContentEncoding, getContentType, getEmbeddedParentReferences, getReference, setContentEncoding, setContentType, setEmbeddedParentReferences
-
Constructor Detail
-
HttpDocInfo
public HttpDocInfo()
-
HttpDocInfo
public HttpDocInfo(String reference)
-
HttpDocInfo
public HttpDocInfo(String url, int depth)
Constructor.
- Parameters:
- url - URL being crawled
- depth - URL depth
-
HttpDocInfo
public HttpDocInfo(DocInfo docDetails)
Copy constructor.
- Parameters:
- docDetails - document details to copy
-
-
Method Detail
-
getEtag
public String getEtag()
Gets the HTTP ETag.
- Returns:
- etag
- Since:
- 3.0.0
-
setEtag
public void setEtag(String etag)
Sets the HTTP ETag.
- Parameters:
- etag - the ETag
- Since:
- 3.0.0
-
getOriginalReference
public String getOriginalReference()
-
setOriginalReference
public void setOriginalReference(String originalReference)
-
getDepth
public int getDepth()
Gets the URL depth.
- Returns:
- URL depth
-
getSitemapLastMod
public ZonedDateTime getSitemapLastMod()
Gets the sitemap last modified date.
- Returns:
- last modified date
-
setSitemapLastMod
public void setSitemapLastMod(ZonedDateTime sitemapLastMod)
Sets the sitemap last modified date.
- Parameters:
- sitemapLastMod - last modified date
-
getSitemapChangeFreq
public String getSitemapChangeFreq()
Gets the sitemap change frequency.
- Returns:
- sitemap change frequency
-
setSitemapChangeFreq
public void setSitemapChangeFreq(String sitemapChangeFreq)
Sets the sitemap change frequency.
- Parameters:
- sitemapChangeFreq - sitemap change frequency
-
getSitemapPriority
public Float getSitemapPriority()
Gets the sitemap priority.
- Returns:
- sitemap priority
-
setSitemapPriority
public void setSitemapPriority(Float sitemapPriority)
Sets the sitemap priority.
- Parameters:
- sitemapPriority - sitemap priority
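The sitemap accessors above use `ZonedDateTime` for the last modified date and `Float` for the priority. As a rough illustration of how raw sitemap text could be turned into those types, here is a minimal sketch. `SitemapValuesSketch`, `parseLastMod` and `parsePriority` are hypothetical names and are not part of the Norconex API; treating a date-only value as midnight UTC and rejecting priorities outside 0.0 to 1.0 are assumptions of this sketch, not documented behavior of the class.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeParseException;

// Hypothetical helper (not part of the Norconex API): converts raw sitemap
// strings into the types used by setSitemapLastMod and setSitemapPriority.
final class SitemapValuesSketch {

    private SitemapValuesSketch() {
    }

    // Parses a full ISO-8601 date-time with offset, or a date-only value.
    // Assumption: date-only values are read as midnight UTC.
    static ZonedDateTime parseLastMod(String value) {
        if (value == null || value.trim().isEmpty()) {
            return null;
        }
        String v = value.trim();
        try {
            return ZonedDateTime.parse(v);
        } catch (DateTimeParseException e) {
            try {
                return LocalDate.parse(v).atStartOfDay(ZoneOffset.UTC);
            } catch (DateTimeParseException e2) {
                return null; // unparseable value
            }
        }
    }

    // Parses a priority. Assumption: values outside [0.0, 1.0] yield null.
    static Float parsePriority(String value) {
        if (value == null) {
            return null;
        }
        try {
            float p = Float.parseFloat(value.trim());
            if (p >= 0f && p <= 1f) {
                return Float.valueOf(p);
            }
            return null;
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

The parsed values could then be passed to `setSitemapLastMod(ZonedDateTime)` and `setSitemapPriority(Float)`. Returning `null` for unusable input is one possible design choice, made here because both setters accept object types.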
-
setDepth
public final void setDepth(int depth)
Sets the URL depth.
- Parameters:
- depth - URL depth
-
getReferrerReference
public String getReferrerReference()
-
setReferrerReference
public void setReferrerReference(String referrerReference)
-
getReferrerLinkMetadata
public String getReferrerLinkMetadata()
-
setReferrerLinkMetadata
public void setReferrerLinkMetadata(String referrerLinkMetadata)
-
setReference
public final void setReference(String url)
- Overrides:
- setReference in class DocInfo
-
getUrlRoot
public String getUrlRoot()
Gets the URL root (protocol + domain, e.g. http://www.host.com).
- Returns:
- URL root
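To make the "protocol + domain" notion concrete, the following sketch derives a root from a URL string using only `java.net.URI`. It is not the implementation behind `getUrlRoot()`; `UrlRootSketch` and `urlRoot` are hypothetical names. Keeping any port or user info that appears in the authority, and returning `null` for relative or malformed input, are assumptions of this sketch.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper (not part of the Norconex API): one possible way to
// compute a "URL root" in the sense of protocol + domain.
final class UrlRootSketch {

    private UrlRootSketch() {
    }

    // Returns scheme + "://" + authority, or null when the input is null,
    // malformed, or lacks a scheme or authority (e.g. a relative path).
    // Assumption: the authority is kept as-is, including any port.
    static String urlRoot(String url) {
        if (url == null) {
            return null;
        }
        try {
            URI uri = new URI(url);
            String scheme = uri.getScheme();
            String authority = uri.getRawAuthority();
            if (scheme == null || authority == null) {
                return null;
            }
            return scheme + "://" + authority;
        } catch (URISyntaxException e) {
            return null;
        }
    }
}
```

For instance, `UrlRootSketch.urlRoot("http://www.host.com/path/page.html?x=1")` yields `http://www.host.com`, matching the example given in the method description.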
-
getReferencedUrls
public List<String> getReferencedUrls()
Gets URLs referenced by this one.
- Returns:
- URLs referenced by this one (never null).
- Since:
- 2.6.0
-
setReferencedUrls
public void setReferencedUrls(List<String> referencedUrls)
Sets URLs referenced by this one.
- Parameters:
- referencedUrls - referenced URLs
- Since:
- 3.0.0
-
getRedirectTrail
public List<String> getRedirectTrail()
Gets the trail of URLs that were redirected up to this one.
- Returns:
- URL redirection trail to this one (never null).
- Since:
- 2.8.0
-
setRedirectTrail
public void setRedirectTrail(List<String> redirectTrail)
Sets the trail of URLs that were redirected up to this one.
- Parameters:
- redirectTrail - URL redirection trail to this one
- Since:
- 3.0.0
-
addRedirectToTrail
public void addRedirectToTrail(String url)
Adds a redirect URL to the trail of URLs that were redirected so far.
- Parameters:
- url - URL to add
- Since:
- 3.0.0
-
getRedirectTarget
public String getRedirectTarget()
Gets the immediate target of a redirect.
- Returns:
- redirect target or null
- Since:
- 3.1.0
-
setRedirectTarget
public void setRedirectTarget(String redirectTarget)
Sets the immediate target of a redirect.
- Parameters:
- redirectTarget - redirect target
- Since:
- 3.1.0
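The redirect methods above describe a contract: the trail getter never returns null, URLs are appended with `addRedirectToTrail`, and the redirect target may be null. The sketch below is a small stand-in class that satisfies that contract so the behavior can be tried without the library. `RedirectTrailSketch` is a hypothetical name; it only mirrors the documented method names and is not the actual `HttpDocInfo` code. Returning an unmodifiable view and treating a null list as empty are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in (not the Norconex class): mirrors the documented
// redirect-related method names to illustrate the described contract.
final class RedirectTrailSketch {

    // Always initialized, so the getter can never return null.
    private final List<String> redirectTrail = new ArrayList<>();

    // Null when this URL was not redirected anywhere.
    private String redirectTarget;

    // Assumption: callers get a read-only view of the trail.
    List<String> getRedirectTrail() {
        return Collections.unmodifiableList(redirectTrail);
    }

    // Assumption: a null argument clears the trail rather than storing null.
    void setRedirectTrail(List<String> trail) {
        redirectTrail.clear();
        if (trail != null) {
            redirectTrail.addAll(trail);
        }
    }

    // Appends one URL, preserving the order in which redirects occurred.
    void addRedirectToTrail(String url) {
        redirectTrail.add(url);
    }

    String getRedirectTarget() {
        return redirectTarget;
    }

    void setRedirectTarget(String redirectTarget) {
        this.redirectTarget = redirectTarget;
    }
}
```

With this stand-in, calling `addRedirectToTrail` twice leaves a two-element trail in insertion order, while `getRedirectTarget()` stays null until a target is set.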
-
equals
public boolean equals(Object other)
- Overrides:
- equals in class CrawlDocInfo
-
hashCode
public int hashCode()
- Overrides:
- hashCode in class CrawlDocInfo
-
toString
public String toString()
- Overrides:
- toString in class CrawlDocInfo
-
-