Package com.norconex.collector.http.link.impl
Class Summary

Class                         Description
DOMLinkExtractor              Extracts links from a Document Object Model (DOM) representation of an HTML, XHTML, or XML document content based on values of matching elements and attributes.
GenericLinkExtractor          Deprecated. Since 3.0.0, use HtmlLinkExtractor or DOMLinkExtractor instead.
HtmlLinkExtractor             Html link extractor for URLs found in HTML and possibly other text files.
HtmlLinkExtractor.RegexPair
RegexLinkExtractor            Link extractor using regular expressions to extract links found in text documents.
TikaLinkExtractor             Implementation of ILinkExtractor using Apache Tika to perform URL extractions from HTML documents.
XMLFeedLinkExtractor
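
The regular-expression approach named in the RegexLinkExtractor description can be illustrated with a minimal, standalone sketch. It relies only on java.util.regex and does not use the Norconex API: the class name RegexLinkSketch, the method extractLinks, and the URL pattern are all assumptions made for illustration, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of com.norconex.collector.http.link.impl.
public class RegexLinkSketch {

    // Assumed pattern: an absolute http/https URL, ending at the first
    // whitespace, quote, or angle bracket.
    private static final Pattern URL_PATTERN =
            Pattern.compile("https?://[^\\s\"'<>]+");

    // Returns each distinct URL found in the text, in order of first appearance.
    public static List<String> extractLinks(String text) {
        List<String> links = new ArrayList<>();
        Matcher matcher = URL_PATTERN.matcher(text);
        while (matcher.find()) {
            String url = matcher.group();
            if (!links.contains(url)) {
                links.add(url);
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String text = "See http://example.com/a and "
                + "<a href=\"https://example.org/b\">b</a>.";
        for (String link : extractLinks(text)) {
            System.out.println(link);
        }
    }
}
```

A pattern this simple will also capture trailing punctuation that directly follows a URL in running text, which is one reason a real extractor would likely need configurable patterns.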