Package com.norconex.collector.http.link
Interface ILinkExtractor
- All Known Implementing Classes:
AbstractLinkExtractor, AbstractTextLinkExtractor, DOMLinkExtractor, GenericLinkExtractor, HtmlLinkExtractor, RegexLinkExtractor, TikaLinkExtractor, XMLFeedLinkExtractor
public interface ILinkExtractor
Responsible for finding links in documents. A link is a URL to be followed,
possibly with contextual information about that URL (the "a" tag attributes
and text).
Implementing classes that also implement IXMLConfigurable should make
sure to name their XML tag "extractor", normally nested
in linkExtractors tags.
- Author:
- Pascal Essiembre
Method Summary
Method Details
extractLinks
- Throws:
IOException
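
The idea of link extraction described above can be illustrated with a minimal, self-contained sketch. It does not implement ILinkExtractor, because the exact method signature is not shown on this page. The class name LinkExtractorSketch, its static extractLinks(Reader) method, and the URL-to-text map it returns are hypothetical names chosen for illustration only. The regular expression is deliberately naive and only handles double-quoted href attributes.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; not part of the Norconex API. */
public class LinkExtractorSketch {

    // Naive pattern: <a ... href="URL" ...>TEXT</a>, case-insensitive.
    private static final Pattern ANCHOR = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*\"([^\"]*)\"[^>]*>(.*?)</a>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /**
     * Reads the whole document and returns each URL found, mapped to
     * the trimmed text of its "a" tag (the contextual information).
     * Reading from the input may fail, hence the IOException.
     */
    public static Map<String, String> extractLinks(Reader input)
            throws IOException {
        StringBuilder content = new StringBuilder();
        char[] buffer = new char[4096];
        int read;
        while ((read = input.read(buffer)) != -1) {
            content.append(buffer, 0, read);
        }

        // LinkedHashMap keeps links in document order.
        Map<String, String> links = new LinkedHashMap<>();
        Matcher matcher = ANCHOR.matcher(content);
        while (matcher.find()) {
            links.put(matcher.group(1), matcher.group(2).trim());
        }
        return links;
    }

    public static void main(String[] args) throws IOException {
        String html = "<p><a href=\"http://example.com/a\">First</a></p>";
        Map<String, String> links = extractLinks(new StringReader(html));
        links.forEach((url, text) -> System.out.println(url + " : " + text));
    }
}
```

A real implementation would normally rely on an HTML parser rather than a regular expression, and would resolve relative URLs against the document reference before returning them.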