Package com.norconex.collector.http.link.impl
Classes

Class / Description

Extracts links from a Document Object Model (DOM) representation of an HTML, XHTML, or XML document content based on values of matching elements and attributes.

Deprecated.

Html link extractor for URLs found in HTML and possibly other text files.

Link extractor using regular expressions to extract links found in text documents.

Implementation of ILinkExtractor using Apache Tika to perform URL extractions from HTML documents.

HtmlLinkExtractor or DOMLinkExtractor instead.
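
As a rough illustration of the regular-expression approach to link extraction summarized above, the sketch below pulls href and src attribute values out of a text document using only java.util.regex. It does not use or reproduce the classes in this package: the class name RegexLinkSketch, the method extractLinks, and the pattern itself are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch; not part of com.norconex.collector.http.link.impl.
class RegexLinkSketch {

    // Assumed pattern: captures the quoted value of an href or src attribute.
    private static final Pattern LINK_PATTERN = Pattern.compile(
            "(?:href|src)\\s*=\\s*[\"']([^\"']+)[\"']",
            Pattern.CASE_INSENSITIVE);

    // Returns every captured attribute value, in document order.
    static List<String> extractLinks(String text) {
        List<String> links = new ArrayList<>();
        Matcher matcher = LINK_PATTERN.matcher(text);
        while (matcher.find()) {
            links.add(matcher.group(1));
        }
        return links;
    }

    public static void main(String[] args) {
        String html =
                "<a href=\"https://example.com/a\">A</a> <img src='/img/b.png'>";
        // Print each extracted link on its own line.
        extractLinks(html).forEach(System.out::println);
    }
}
```

A pattern like this matches raw text, so it also picks up attribute-like strings inside comments or scripts and does not resolve relative URLs. A DOM-based extractor avoids the first problem by working on parsed elements and attributes instead.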