Package | Description |
---|---|
com.norconex.collector.http.crawler | |
com.norconex.collector.http.url.impl | |
Modifier and Type | Method and Description |
---|---|
ILinkExtractor[] | HttpCrawlerConfig.getLinkExtractors() |
Modifier and Type | Method and Description |
---|---|
void | HttpCrawlerConfig.setLinkExtractors(ILinkExtractor... linkExtractors) |
Modifier and Type | Class and Description |
---|---|
class | GenericLinkExtractor: Generic link extractor for URLs found in HTML and possibly other text files. |
class | RegexLinkExtractor: Link extractor using regular expressions to extract links found in text documents. |
class | TikaLinkExtractor: Implementation of ILinkExtractor using Apache Tika to perform URL extractions from HTML documents. |
class | XMLFeedLinkExtractor |
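
The idea behind regular-expression link extraction, as described for RegexLinkExtractor above, can be illustrated with a minimal standalone sketch that uses only `java.util.regex`. This is not the Norconex implementation and does not use the ILinkExtractor interface; the class name `RegexLinkSketch`, the method `extractLinks`, and the URL pattern are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: not part of the Norconex API.
public class RegexLinkSketch {

    // Assumed pattern: absolute http(s) URLs, ending at the first
    // whitespace, quote, or angle bracket. A real extractor would
    // likely need a configurable and more careful pattern.
    private static final Pattern URL_PATTERN =
            Pattern.compile("https?://[^\\s\"'<>]+");

    // Returns the distinct URLs found in the text, in order of appearance.
    public static List<String> extractLinks(String text) {
        List<String> links = new ArrayList<>();
        Matcher matcher = URL_PATTERN.matcher(text);
        while (matcher.find()) {
            String url = matcher.group();
            if (!links.contains(url)) {
                links.add(url);
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String text = "See https://example.com/a and "
                + "<a href=\"http://example.org/b?x=1\">b</a>.";
        System.out.println(extractLinks(text));
    }
}
```

A plain pattern like this one only finds absolute URLs and may capture trailing punctuation when a URL is directly followed by it, which is one reason a dedicated extractor is preferable to an ad hoc expression.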
Copyright © 2009–2021 Norconex Inc. All rights reserved.