Class HttpDocInfo

All Implemented Interfaces:
Serializable

public class HttpDocInfo extends CrawlDocInfo
Holds a URL being crawled together with its relevant crawl information.
Author:
Pascal Essiembre
  • Constructor Details

    • HttpDocInfo

      public HttpDocInfo()
    • HttpDocInfo

      public HttpDocInfo(String reference)
    • HttpDocInfo

      public HttpDocInfo(String url, int depth)
      Creates an instance for the given URL at the given crawl depth.
      Parameters:
      url - URL being crawled
      depth - URL depth
    • HttpDocInfo

      public HttpDocInfo(DocInfo docDetails)
      Copy constructor.
      Parameters:
      docDetails - document details to copy
  • Method Details

    • getEtag

      public String getEtag()
      Gets the HTTP ETag.
      Returns:
      the HTTP ETag
      Since:
      3.0.0
    • setEtag

      public void setEtag(String etag)
      Sets the HTTP ETag.
      Parameters:
      etag - the ETag
      Since:
      3.0.0
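
A stored ETag is typically compared against the one a server returns on a later request. The following is a minimal standalone sketch of the weak comparison defined by the HTTP specification for If-None-Match, where a leading W/ marker is ignored. EtagSketch and its methods are hypothetical names for illustration; nothing here states how this class or the crawler uses the value.

```java
// Standalone sketch; EtagSketch is a hypothetical helper, not part of the library.
class EtagSketch {

    // Strips the weak-validator prefix "W/" if present, leaving the quoted opaque tag.
    static String opaqueTag(String etag) {
        return etag.startsWith("W/") ? etag.substring(2) : etag;
    }

    // Weak comparison: opaque tags must match exactly; weakness markers are ignored.
    static boolean weakMatch(String stored, String received) {
        if (stored == null || received == null) {
            return false;
        }
        return opaqueTag(stored).equals(opaqueTag(received));
    }

    public static void main(String[] args) {
        System.out.println(weakMatch("W/\"abc\"", "\"abc\""));
        System.out.println(weakMatch("\"abc\"", "\"abd\""));
    }
}
```

In this sketch a missing ETag on either side is treated as "no match", so a caller would fall back to another change-detection strategy.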
    • getOriginalReference

      public String getOriginalReference()
    • setOriginalReference

      public void setOriginalReference(String originalReference)
    • getDepth

      public int getDepth()
      Gets the URL depth.
      Returns:
      URL depth
    • getSitemapLastMod

      public ZonedDateTime getSitemapLastMod()
      Gets the sitemap last modified date.
      Returns:
      last modified date
    • setSitemapLastMod

      public void setSitemapLastMod(ZonedDateTime sitemapLastMod)
      Sets the sitemap last modified date.
      Parameters:
      sitemapLastMod - last modified date
    • getSitemapChangeFreq

      public String getSitemapChangeFreq()
      Gets the sitemap change frequency.
      Returns:
      sitemap change frequency
    • setSitemapChangeFreq

      public void setSitemapChangeFreq(String sitemapChangeFreq)
      Sets the sitemap change frequency.
      Parameters:
      sitemapChangeFreq - sitemap change frequency
    • getSitemapPriority

      public Float getSitemapPriority()
      Gets the sitemap priority.
      Returns:
      sitemap priority
    • setSitemapPriority

      public void setSitemapPriority(Float sitemapPriority)
      Sets the sitemap priority.
      Parameters:
      sitemapPriority - sitemap priority
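
The sitemap setters above take a ZonedDateTime, a String, and a Float. As a rough illustration, the sketch below converts raw sitemap text values into those types. SitemapValuesSketch and its methods are hypothetical; treating a date-only lastmod as midnight UTC and rejecting priorities outside 0.0 to 1.0 are assumptions of this example, not documented behavior of this class.

```java
import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Standalone sketch; SitemapValuesSketch is a hypothetical helper, not part of the library.
class SitemapValuesSketch {

    // Parses a sitemap <lastmod> value given as a date (YYYY-MM-DD)
    // or as a date-time with an offset (e.g. 2004-12-23T18:00:15+00:00).
    static ZonedDateTime parseLastMod(String value) {
        if (value == null || value.trim().isEmpty()) {
            return null;
        }
        String v = value.trim();
        if (v.length() == 10) {
            // Assumption: a date without time is read as midnight UTC.
            return LocalDate.parse(v).atStartOfDay(ZoneOffset.UTC);
        }
        return OffsetDateTime.parse(v).toZonedDateTime();
    }

    // Parses a sitemap <priority> value; returns null when absent or out of range.
    static Float parsePriority(String value) {
        if (value == null || value.trim().isEmpty()) {
            return null;
        }
        float p = Float.parseFloat(value.trim());
        return (p < 0f || p > 1f) ? null : Float.valueOf(p);
    }

    public static void main(String[] args) {
        System.out.println(parseLastMod("2004-12-23T18:00:15+00:00"));
        System.out.println(parsePriority("0.8"));
    }
}
```

With the real class, the parsed values would then presumably be passed to setSitemapLastMod and setSitemapPriority, while the change frequency string (for example "daily") can be stored as-is.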
    • setDepth

      public final void setDepth(int depth)
      Sets the URL depth.
      Parameters:
      depth - URL depth
    • getReferrerReference

      public String getReferrerReference()
    • setReferrerReference

      public void setReferrerReference(String referrerReference)
    • getReferrerLinkMetadata

      public String getReferrerLinkMetadata()
    • setReferrerLinkMetadata

      public void setReferrerLinkMetadata(String referrerLinkMetadata)
    • setReference

      public final void setReference(String url)
      Overrides:
      setReference in class DocInfo
    • getUrlRoot

      public String getUrlRoot()
      Gets the URL root (protocol + domain, e.g. http://www.host.com).
      Returns:
      URL root
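
To make "protocol + domain" concrete, here is a minimal standalone sketch that derives such a root with java.net.URI. UrlRootSketch and urlRoot are hypothetical names, and this is not necessarily how getUrlRoot is implemented. In particular, this version keeps any port or user info found in the authority part, which the library may handle differently.

```java
import java.net.URI;

// Standalone sketch; UrlRootSketch is a hypothetical helper, not part of the library.
class UrlRootSketch {

    // Returns scheme + "://" + authority, or null when either part is missing.
    static String urlRoot(String url) {
        if (url == null) {
            return null;
        }
        URI uri = URI.create(url);
        if (uri.getScheme() == null || uri.getRawAuthority() == null) {
            return null;
        }
        return uri.getScheme() + "://" + uri.getRawAuthority();
    }

    public static void main(String[] args) {
        // Path and query string are dropped; only the root remains.
        System.out.println(urlRoot("http://www.host.com/a/b.html?x=1"));
    }
}
```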
    • getReferencedUrls

      public List<String> getReferencedUrls()
      Gets URLs referenced by this one.
      Returns:
      URLs referenced by this one (never null).
      Since:
      2.6.0
    • setReferencedUrls

      public void setReferencedUrls(List<String> referencedUrls)
      Sets URLs referenced by this one.
      Parameters:
      referencedUrls - referenced URLs
      Since:
      3.0.0
    • getRedirectTrail

      public List<String> getRedirectTrail()
      Gets the trail of URLs that were redirected up to this one.
      Returns:
      URL redirection trail to this one (never null).
      Since:
      2.8.0
    • setRedirectTrail

      public void setRedirectTrail(List<String> redirectTrail)
      Sets the trail of URLs that were redirected up to this one.
      Parameters:
      redirectTrail - URL redirection trail to this one
      Since:
      3.0.0
    • addRedirectToTrail

      public void addRedirectToTrail(String url)
      Adds a redirect URL to the trail of URLs that were redirected so far.
      Parameters:
      url - URL to add
      Since:
      3.0.0
    • getRedirectTarget

      public String getRedirectTarget()
      Gets the immediate target of a redirect.
      Returns:
      redirect target or null
      Since:
      3.1.0
    • setRedirectTarget

      public void setRedirectTarget(String redirectTarget)
      Sets the immediate target of a redirect.
      Parameters:
      redirectTarget - redirect target
      Since:
      3.1.0
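
The relationship between the redirect trail and the redirect target can be sketched with a small stand-in model. It assumes one reading of the descriptions above: for a chain A to B to C, the entry for C carries the trail [A, B], while the entry for A has B as its immediate target. RedirectTrailSketch and Hop are hypothetical classes for illustration only and do not use the library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Standalone sketch; Hop is a hypothetical stand-in, not the library's class.
class RedirectTrailSketch {

    static final class Hop {
        final String reference;            // URL of this hop
        final List<String> redirectTrail;  // URLs redirected up to this one (never null)
        final String redirectTarget;       // immediate target, or null if not redirected

        Hop(String reference, List<String> redirectTrail, String redirectTarget) {
            this.reference = reference;
            this.redirectTrail =
                    Collections.unmodifiableList(new ArrayList<>(redirectTrail));
            this.redirectTarget = redirectTarget;
        }
    }

    // Builds one Hop per URL in an ordered redirect chain.
    static List<Hop> walk(List<String> chain) {
        List<Hop> hops = new ArrayList<>();
        for (int i = 0; i < chain.size(); i++) {
            String target = (i + 1 < chain.size()) ? chain.get(i + 1) : null;
            hops.add(new Hop(chain.get(i), chain.subList(0, i), target));
        }
        return hops;
    }

    public static void main(String[] args) {
        List<Hop> hops = walk(Arrays.asList(
                "http://a.example/", "http://b.example/", "http://c.example/"));
        Hop last = hops.get(hops.size() - 1);
        System.out.println(last.reference + " reached via " + last.redirectTrail);
    }
}
```

Copying the trail into an unmodifiable list keeps each hop's history independent of later additions, which mirrors the "never null" guarantee of getRedirectTrail without claiming anything about its mutability.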
    • equals

      public boolean equals(Object other)
      Overrides:
      equals in class CrawlDocInfo
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class CrawlDocInfo
    • toString

      public String toString()
      Overrides:
      toString in class CrawlDocInfo