Interface IDelayResolver

  • All Known Implementing Classes:
    AbstractDelayResolver, GenericDelayResolver, ReferenceDelayResolver

    public interface IDelayResolver
    Resolves and applies intentional "delays" between document downloads, so that requests are spaced out over time. This interface does not dictate how delays are resolved. It is left to implementors to put in place their own strategy (e.g., pause all threads, or delay only successive crawls on the same website domain). Try to be "nice" to the web sites you crawl.
    Author:
    Pascal Essiembre
    • Method Detail

      • delay

        void delay(RobotsTxt robotsTxt,
                   String url)
        Delay crawling activities (if applicable).
        Parameters:
        robotsTxt - robots.txt instance (if applicable)
        url - the URL being crawled
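
Since the interface leaves the strategy open, the following is a minimal sketch of one possible implementation: a per-host delay, where only successive requests to the same host are spaced out. It is not how the library's own resolvers work. To keep the sketch compilable on its own, it declares a local copy of the interface and a hypothetical stand-in for RobotsTxt that models only a crawl delay in seconds. The class name PerHostDelayResolver, the helper methods, and the getCrawlDelay() accessor on the stand-in are assumptions made for illustration.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the library's RobotsTxt type (assumption):
// only a crawl delay, in seconds, is modeled here.
class RobotsTxt {
    private final float crawlDelaySeconds;

    RobotsTxt(float crawlDelaySeconds) {
        this.crawlDelaySeconds = crawlDelaySeconds;
    }

    float getCrawlDelay() {
        return crawlDelaySeconds;
    }
}

// Local copy of the documented method signature so the sketch stands alone.
interface IDelayResolver {
    void delay(RobotsTxt robotsTxt, String url);
}

// Illustrative strategy: space out requests to the same host only.
class PerHostDelayResolver implements IDelayResolver {
    private final long defaultDelayMillis;
    // Earliest time (epoch millis) at which each host may be hit again.
    private final Map<String, Long> nextAllowed = new ConcurrentHashMap<>();

    PerHostDelayResolver(long defaultDelayMillis) {
        this.defaultDelayMillis = defaultDelayMillis;
    }

    @Override
    public void delay(RobotsTxt robotsTxt, String url) {
        final long delayMillis = resolveDelayMillis(robotsTxt);
        final long now = System.currentTimeMillis();
        final long[] sleepFor = new long[1];

        // Atomically reserve the next time slot for this host.
        nextAllowed.compute(hostOf(url), (host, next) -> {
            long start = (next == null || next < now) ? now : next;
            sleepFor[0] = start - now;
            return start + delayMillis;
        });

        if (sleepFor[0] > 0) {
            try {
                Thread.sleep(sleepFor[0]);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and return without further waiting.
                Thread.currentThread().interrupt();
            }
        }
    }

    // Prefer a positive robots.txt crawl delay; otherwise use the default.
    long resolveDelayMillis(RobotsTxt robotsTxt) {
        if (robotsTxt != null && robotsTxt.getCrawlDelay() > 0) {
            return (long) (robotsTxt.getCrawlDelay() * 1000);
        }
        return defaultDelayMillis;
    }

    // Lower-cased host of the URL, or the raw URL if no host can be parsed.
    static String hostOf(String url) {
        try {
            String host = URI.create(url).getHost();
            return host != null ? host.toLowerCase(Locale.ROOT) : url;
        } catch (IllegalArgumentException e) {
            return url;
        }
    }
}
```

With this sketch, the first call for a given host returns immediately, and a second call for the same host blocks until the reserved interval has elapsed. Calls for different hosts do not wait on each other, which corresponds to the "same website domain only" strategy mentioned in the interface description.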