java.lang.Object
- com.norconex.collector.http.crawler.URLCrawlScopeStrategy

```
public class URLCrawlScopeStrategy
extends Object
```
By default a crawler will try to follow all links it discovers. You can define your own filters to limit the scope of the pages being crawled. When multiple URLs are defined as start URLs, it can be tricky to perform global filtering that applies to every URL without causing URL filtering conflicts. This class offers an easy way to address a frequent URL filtering need: to "stay on site". That is, when processing a page and extracting the URLs found in it, keep only the URLs that are on the same site as the page URL.

By default this class does not request to stay on a site.
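To make the "stay on site" idea concrete, here is a minimal sketch that filters the links extracted from a page by comparing hosts. It is only an illustration built on `java.net.URI`, not this class's implementation; the class name `StayOnSiteSketch` and the method `keepSameHost` are hypothetical.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the crawler API.
final class StayOnSiteSketch {

    // Keeps only the links whose host matches the host of the page they
    // were extracted from. Host comparison is case-insensitive.
    // URI.create throws IllegalArgumentException on malformed input.
    static List<String> keepSameHost(String pageUrl, List<String> links) {
        String pageHost = URI.create(pageUrl).getHost();
        List<String> kept = new ArrayList<>();
        for (String link : links) {
            String host = URI.create(link).getHost();
            // equalsIgnoreCase(null) is false, so host-less links are dropped.
            if (pageHost != null && pageHost.equalsIgnoreCase(host)) {
                kept.add(link);
            }
        }
        return kept;
    }
}
```

Because each link is compared against the page it was found on rather than against one global pattern, the same check works unchanged when several start URLs point to different sites.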

Since:

2.3.0

Author:

Pascal Essiembre

Constructor Summary

Constructors
Constructor Description

URLCrawlScopeStrategy()

Method Summary

All Methods Instance Methods Concrete Methods
Modifier and Type	Method	Description
`boolean`	`equals(Object other)`
`int`	`hashCode()`
`boolean`	`isIncludeSubdomains()`	Gets whether sub-domains are considered to be the same as a URL domain.
`boolean`	`isInScope(String inScopeURL, String candidateURL)`
`boolean`	`isStayOnDomain()`	Gets whether the crawler should always stay on the same domain name as the domain for each URL specified as a start URL.
`boolean`	`isStayOnPort()`	Gets whether the crawler should always stay on the same port as the port for each URL specified as a start URL.
`boolean`	`isStayOnProtocol()`	Gets whether the crawler should always stay on the same protocol as the protocol for each URL specified as a start URL.
`void`	`setIncludeSubdomains(boolean includeSubdomains)`	Sets whether sub-domains are considered to be the same as a URL domain.
`void`	`setStayOnDomain(boolean stayOnDomain)`	Sets whether the crawler should always stay on the same domain name as the domain for each URL specified as a start URL.
`void`	`setStayOnPort(boolean stayOnPort)`	Sets whether the crawler should always stay on the same port as the port for each URL specified as a start URL.
`void`	`setStayOnProtocol(boolean stayOnProtocol)`	Sets whether the crawler should always stay on the same protocol as the protocol for each URL specified as a start URL.
`String`	`toString()`

Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait

- Constructor Detail
  - URLCrawlScopeStrategy
```
public URLCrawlScopeStrategy()
```
- Method Detail
  - isStayOnDomain
```
public boolean isStayOnDomain()
```
    Gets whether the crawler should always stay on the same domain name as the domain for each URL specified as a start URL. By default (false) the crawler will try to follow any discovered links not otherwise rejected by other settings (such as regular filtering rules you may have).
    
    Returns:
    
    true if the crawler should stay on a domain
  - setStayOnDomain
```
public void setStayOnDomain(boolean stayOnDomain)
```
    Sets whether the crawler should always stay on the same domain name as the domain for each URL specified as a start URL.
    
    Parameters:
    
    stayOnDomain - true for the crawler to stay on domain
  - isIncludeSubdomains
```
public boolean isIncludeSubdomains()
```
    Gets whether sub-domains are considered to be the same as a URL domain. Only applicable when "stayOnDomain" is true.
    
    Returns:
    
    true if including sub-domains
    
    Since:
    
    2.9.0
  - setIncludeSubdomains
```
public void setIncludeSubdomains(boolean includeSubdomains)
```
    Sets whether sub-domains are considered to be the same as a URL domain. Only applicable when "stayOnDomain" is true.
    
    Parameters:
    
    includeSubdomains - true to include sub-domains
    
    Since:
    
    2.9.0
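One plausible reading of the sub-domain option is sketched below: a candidate host is accepted when it equals the in-scope host or, with the option on, when it ends with a dot followed by that host. This is an assumption about the matching rule, not the library's code; `DomainScopeSketch` and `isOnDomain` are hypothetical names.

```java
import java.util.Locale;

// Hypothetical helper, not part of the crawler API.
final class DomainScopeSketch {

    // Returns true when candidateHost is the same host as scopeHost or,
    // if includeSubdomains is true, a sub-domain of it.
    static boolean isOnDomain(
            String scopeHost, String candidateHost, boolean includeSubdomains) {
        String scope = scopeHost.toLowerCase(Locale.ROOT);
        String candidate = candidateHost.toLowerCase(Locale.ROOT);
        if (scope.equals(candidate)) {
            return true;
        }
        // The leading dot prevents "notexample.com" from matching "example.com".
        return includeSubdomains && candidate.endsWith("." + scope);
    }
}
```

Under this rule, `blog.example.com` is in scope for `example.com` only when sub-domains are included, and the option has no effect unless domain matching is applied at all.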
  - isStayOnPort
```
public boolean isStayOnPort()
```
    Gets whether the crawler should always stay on the same port as the port for each URL specified as a start URL. By default (false) the crawler will try to follow any discovered links not otherwise rejected by other settings (such as regular filtering rules you may have).
    
    Returns:
    
    true if the crawler should stay on a port
  - setStayOnPort
```
public void setStayOnPort(boolean stayOnPort)
```
    Sets whether the crawler should always stay on the same port as the port for each URL specified as a start URL.
    
    Parameters:
    
    stayOnPort - true for the crawler to stay on port
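A port comparison has one subtlety: a URL may omit the port, in which case `java.net.URI.getPort()` returns -1. The sketch below assumes that an omitted port should count as the scheme's default (80 for http, 443 for https); this page does not say how the class treats that case. `PortScopeSketch` and its methods are hypothetical names.

```java
import java.net.URI;

// Hypothetical helper, not part of the crawler API.
final class PortScopeSketch {

    // Resolves the port actually used by a URL, substituting the
    // well-known default when none is written out.
    static int effectivePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        String scheme = uri.getScheme();
        if ("https".equalsIgnoreCase(scheme)) {
            return 443;
        }
        if ("http".equalsIgnoreCase(scheme)) {
            return 80;
        }
        return -1; // unknown scheme: no default assumed
    }

    // Returns true when both URLs resolve to the same effective port.
    static boolean samePort(String scopeUrl, String candidateUrl) {
        return effectivePort(URI.create(scopeUrl))
                == effectivePort(URI.create(candidateUrl));
    }
}
```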
  - isStayOnProtocol
```
public boolean isStayOnProtocol()
```
    Gets whether the crawler should always stay on the same protocol as the protocol for each URL specified as a start URL. By default (false) the crawler will try to follow any discovered links not otherwise rejected by other settings (such as regular filtering rules you may have).
    
    Returns:
    
    true if the crawler should stay on protocol
  - setStayOnProtocol
```
public void setStayOnProtocol(boolean stayOnProtocol)
```
    Sets whether the crawler should always stay on the same protocol as the protocol for each URL specified as a start URL.
    
    Parameters:
    
    stayOnProtocol - true for the crawler to stay on protocol
  - isInScope
```
public boolean isInScope(String inScopeURL,
                         String candidateURL)
```
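This page gives no description for `isInScope`. Presumably it applies the settings above to decide whether `candidateURL` is in scope relative to `inScopeURL`. The sketch below shows how such a check could combine the flags; it is an assumption-laden illustration, not the class's source. `ScopeCheckSketch` is a hypothetical stand-in, and ports are compared as written in the URL.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical stand-in mirroring the four settings documented above.
final class ScopeCheckSketch {
    boolean stayOnProtocol;
    boolean stayOnDomain;
    boolean includeSubdomains;
    boolean stayOnPort;

    boolean isInScope(String inScopeURL, String candidateURL) {
        // With no restriction enabled, every candidate is accepted.
        if (!stayOnProtocol && !stayOnDomain && !stayOnPort) {
            return true;
        }
        URI scope = URI.create(inScopeURL);
        URI candidate = URI.create(candidateURL);
        if (stayOnProtocol
                && !sameIgnoreCase(scope.getScheme(), candidate.getScheme())) {
            return false;
        }
        if (stayOnDomain && !onDomain(scope.getHost(), candidate.getHost())) {
            return false;
        }
        // Assumption: ports are compared as written (-1 when omitted).
        if (stayOnPort && scope.getPort() != candidate.getPort()) {
            return false;
        }
        return true;
    }

    private boolean onDomain(String scopeHost, String candidateHost) {
        if (scopeHost == null || candidateHost == null) {
            return false;
        }
        String s = scopeHost.toLowerCase(Locale.ROOT);
        String c = candidateHost.toLowerCase(Locale.ROOT);
        return s.equals(c) || (includeSubdomains && c.endsWith("." + s));
    }

    private static boolean sameIgnoreCase(String a, String b) {
        return a != null && a.equalsIgnoreCase(b);
    }
}
```

Each enabled flag acts as an independent veto, so turning on more flags can only narrow the set of accepted URLs.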
  - equals
```
public boolean equals(Object other)
```
    Overrides:
    
    equals in class Object
  - hashCode
```
public int hashCode()
```
    Overrides:
    
    hashCode in class Object
  - toString
```
public String toString()
```
    Overrides:
    
    toString in class Object
