Class StandardRobotsTxtProvider

  • All Implemented Interfaces:
    IRobotsTxtProvider

    public class StandardRobotsTxtProvider
    extends Object
    implements IRobotsTxtProvider

    Implementation of IRobotsTxtProvider as per the robots.txt standard described at http://www.robotstxt.org/robotstxt.html.
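
    To illustrate what following the robots.txt standard involves, here is a minimal, self-contained sketch of how "User-agent" and "Disallow" records can be matched against a URL path. It is not the code of StandardRobotsTxtProvider: the class RobotsTxtSketch and its methods are hypothetical names chosen for this example, and the parsing is simplified (prefix matching only, a new record is assumed to start at a "User-agent" line that follows a rule line).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not part of the Norconex API.
public class RobotsTxtSketch {

    /** Collects the Disallow prefixes that apply to the given user agent. */
    static List<String> disallowedPrefixes(String robotsTxt, String userAgent) {
        List<String> specific = new ArrayList<>(); // rules naming this agent
        List<String> wildcard = new ArrayList<>(); // rules under "User-agent: *"
        boolean specificSeen = false;
        List<String> current = null;               // rule list of the current record
        boolean lastWasAgent = false;
        String agent = userAgent.toLowerCase(Locale.ROOT);

        for (String raw : robotsTxt.split("\\R")) {
            // Strip comments and surrounding whitespace.
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank or malformed line
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();

            if (field.equals("user-agent")) {
                if (!lastWasAgent) {
                    current = null; // a new record begins
                }
                String name = value.toLowerCase(Locale.ROOT);
                if (name.equals("*")) {
                    if (current == null) {
                        current = wildcard;
                    }
                } else if (!name.isEmpty() && agent.contains(name)) {
                    current = specific;
                    specificSeen = true;
                }
                lastWasAgent = true;
            } else {
                // An empty Disallow value means "nothing is disallowed".
                if (field.equals("disallow") && current != null && !value.isEmpty()) {
                    current.add(value);
                }
                lastWasAgent = false;
            }
        }
        // A record naming the agent takes precedence over the "*" record.
        return specificSeen ? specific : wildcard;
    }

    /** Returns true if no applicable Disallow prefix matches the path. */
    static boolean isAllowed(String robotsTxt, String userAgent, String path) {
        for (String prefix : disallowedPrefixes(robotsTxt, userAgent)) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String robotsTxt = String.join("\n",
                "# sample file",
                "User-agent: *",
                "Disallow: /private/",
                "",
                "User-agent: examplebot",
                "Disallow: /");
        System.out.println(isAllowed(robotsTxt, "mybot/1.0", "/index.html"));
        System.out.println(isAllowed(robotsTxt, "mybot/1.0", "/private/a.html"));
        System.out.println(isAllowed(robotsTxt, "ExampleBot/2.0", "/index.html"));
    }
}
```

    In this sketch, a record that names the crawler's user agent overrides the "*" record, which matches the precedence described by the standard.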

    XML configuration usage:

    
    <robotsTxt
        ignore="false"
        class="com.norconex.collector.http.robot.impl.StandardRobotsTxtProvider"/>

    XML usage example:

    
    <robotsTxt
        ignore="true"/>

    With ignore="true", as in the above example, "robots.txt" files present on web sites are ignored.

    Author:
    Pascal Essiembre