Class HttpDocInfo

  • All Implemented Interfaces:
    Serializable

    public class HttpDocInfo
    extends CrawlDocInfo
    Holds crawl-related information about a URL being crawled.
    Author:
    Pascal Essiembre
    See Also:
    Serialized Form
    • Constructor Detail

      • HttpDocInfo

        public HttpDocInfo()
      • HttpDocInfo

        public HttpDocInfo(String reference)
      • HttpDocInfo

        public HttpDocInfo(String url,
                           int depth)
        Creates an instance for the given URL and crawl depth.
        Parameters:
        url - URL being crawled
        depth - URL depth
      • HttpDocInfo

        public HttpDocInfo(DocInfo docDetails)
        Copy constructor.
        Parameters:
        docDetails - document details to copy
    • Method Detail

      • getEtag

        public String getEtag()
        Gets the HTTP ETag.
        Returns:
        the ETag
        Since:
        3.0.0
      • setEtag

        public void setEtag(String etag)
        Sets the HTTP ETag.
        Parameters:
        etag - the ETag
        Since:
        3.0.0
      • getOriginalReference

        public String getOriginalReference()
      • setOriginalReference

        public void setOriginalReference(String originalReference)
      • getDepth

        public int getDepth()
        Gets the URL depth.
        Returns:
        URL depth
      • getSitemapLastMod

        public ZonedDateTime getSitemapLastMod()
        Gets the sitemap last modified date.
        Returns:
        last modified date
      • setSitemapLastMod

        public void setSitemapLastMod(ZonedDateTime sitemapLastMod)
        Sets the sitemap last modified date.
        Parameters:
        sitemapLastMod - last modified date
      • getSitemapChangeFreq

        public String getSitemapChangeFreq()
        Gets the sitemap change frequency.
        Returns:
        sitemap change frequency
      • setSitemapChangeFreq

        public void setSitemapChangeFreq(String sitemapChangeFreq)
        Sets the sitemap change frequency.
        Parameters:
        sitemapChangeFreq - sitemap change frequency
      • getSitemapPriority

        public Float getSitemapPriority()
        Gets the sitemap priority.
        Returns:
        sitemap priority
      • setSitemapPriority

        public void setSitemapPriority(Float sitemapPriority)
        Sets the sitemap priority.
        Parameters:
        sitemapPriority - sitemap priority
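        The sitemap accessors above take a ZonedDateTime, a String, and a Float. The following is a minimal sketch of converting raw sitemap text into those types before calling the setters. The class SitemapValuesSketch and its methods are hypothetical helpers, not part of this API, and the sketch assumes the last-modified value carries a full timestamp with an offset.

```java
import java.time.ZonedDateTime;

// Hypothetical helper class; it is NOT part of the documented API.
final class SitemapValuesSketch {

    private SitemapValuesSketch() {
    }

    // Parses a full ISO-8601 timestamp with an offset, for example
    // "2024-01-15T10:30:00+01:00". Assumption: date-only values such as
    // "2024-01-15" are not handled here and would throw
    // DateTimeParseException.
    static ZonedDateTime parseLastMod(String raw) {
        return raw == null ? null : ZonedDateTime.parse(raw.trim());
    }

    // Converts a priority string such as "0.8" to a Float, keeping null
    // for "not specified" since the documented accessor uses the boxed type.
    static Float parsePriority(String raw) {
        return raw == null ? null : Float.valueOf(raw.trim());
    }
}
```

        The results could then be passed to setSitemapLastMod and setSitemapPriority; the change frequency is a plain String and needs no conversion.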
      • setDepth

        public final void setDepth(int depth)
        Sets the URL depth.
        Parameters:
        depth - URL depth
      • getReferrerReference

        public String getReferrerReference()
      • setReferrerReference

        public void setReferrerReference(String referrerReference)
      • getReferrerLinkMetadata

        public String getReferrerLinkMetadata()
      • setReferrerLinkMetadata

        public void setReferrerLinkMetadata(String referrerLinkMetadata)
      • getUrlRoot

        public String getUrlRoot()
        Gets the URL root (protocol + domain, e.g. http://www.host.com).
        Returns:
        URL root
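        To illustrate what "protocol + domain" means, here is a minimal sketch that derives such a root from a URL string with java.net.URI. UrlRootSketch and urlRoot are hypothetical names, and this is not necessarily how getUrlRoot is implemented. In particular, the sketch drops any port, which is an assumption.

```java
import java.net.URI;

// Hypothetical helper class; it is NOT the library's implementation.
final class UrlRootSketch {

    private UrlRootSketch() {
    }

    // Returns scheme + "://" + host, or null when the input has no
    // scheme or host (for example, a relative path).
    // Assumption: the port and any user info are left out.
    // URI.create throws IllegalArgumentException on malformed input.
    static String urlRoot(String url) {
        URI uri = URI.create(url);
        if (uri.getScheme() == null || uri.getHost() == null) {
            return null;
        }
        return uri.getScheme() + "://" + uri.getHost();
    }
}
```

        With this sketch, urlRoot("http://www.host.com/a/b?c=d") yields "http://www.host.com".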
      • getReferencedUrls

        public List<String> getReferencedUrls()
        Gets URLs referenced by this one.
        Returns:
        URLs referenced by this one (never null).
        Since:
        2.6.0
      • setReferencedUrls

        public void setReferencedUrls(List<String> referencedUrls)
        Sets URLs referenced by this one.
        Parameters:
        referencedUrls - referenced URLs
        Since:
        3.0.0
      • getRedirectTrail

        public List<String> getRedirectTrail()
        Gets the trail of URLs that were redirected up to this one.
        Returns:
        URL redirection trail to this one (never null).
        Since:
        2.8.0
      • setRedirectTrail

        public void setRedirectTrail(List<String> redirectTrail)
        Sets the trail of URLs that were redirected up to this one.
        Parameters:
        redirectTrail - URL redirection trail to this one
        Since:
        3.0.0
      • addRedirectURL

        public void addRedirectURL(String url)
        Adds a redirect URL to the trail of URLs that were redirected so far.
        Parameters:
        url - URL to add
        Since:
        3.0.0
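        The three redirect-trail methods above can be pictured with the following minimal sketch. RedirectTrailSketch is a hypothetical stand-in, not the real HttpDocInfo. It assumes that the trail keeps insertion order, that a null argument to the setter clears the trail, and that the getter returns a read-only view. The documentation above only guarantees that the getter never returns null.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in mirroring the documented method names.
// It is NOT the real HttpDocInfo class.
final class RedirectTrailSketch {

    private final List<String> redirectTrail = new ArrayList<>();

    // Never returns null, matching the documented contract.
    // Assumption: callers get a read-only view.
    List<String> getRedirectTrail() {
        return Collections.unmodifiableList(redirectTrail);
    }

    // Replaces the whole trail. Assumption: null means "no redirects".
    void setRedirectTrail(List<String> trail) {
        redirectTrail.clear();
        if (trail != null) {
            redirectTrail.addAll(trail);
        }
    }

    // Appends one more URL to the trail collected so far.
    void addRedirectURL(String url) {
        redirectTrail.add(url);
    }
}
```

        In this sketch, calling addRedirectURL once per hop builds the trail incrementally, while setRedirectTrail replaces it in one step.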