public interface ILinkExtractor
Implementations that also implement IXMLConfigurable should make sure to name their XML tag "extractor", normally nested in linkExtractors tags.

Modifier and Type | Method and Description |
---|---|
boolean | accepts(String url, ContentType contentType): Whether this link extraction should be executed for the given URL and/or content type. |
Set<Link> | extractLinks(InputStream input, String reference, ContentType contentType): Extracts links from a document. |
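
The two methods summarized above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: `Link` and `ContentType` here are simplified stand-ins for the real Norconex types, the interface is re-declared locally only to mirror the documented signatures, and `HrefLinkExtractor` is a hypothetical implementation that picks up absolute `href` values with a regular expression. It also assumes UTF-8 input and Java 9 or later (for `InputStream.readAllBytes()`).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Self-contained sketch: Link and ContentType below are simplified stand-ins
// for the library types of the same name, NOT the real Norconex classes.
class LinkExtractorSketch {

    /** Stand-in for ContentType (assumption: wraps a MIME type string). */
    static final class ContentType {
        private final String value;
        ContentType(String value) { this.value = value; }
        @Override public String toString() { return value; }
    }

    /** Stand-in for Link (assumption: holds a URL and the referring document). */
    static final class Link {
        private final String url;
        private final String referrer;
        Link(String url, String referrer) { this.url = url; this.referrer = referrer; }
        String getUrl() { return url; }
        String getReferrer() { return referrer; }
        // Two links are considered equal when their URLs match.
        @Override public boolean equals(Object o) {
            return o instanceof Link && ((Link) o).url.equals(url);
        }
        @Override public int hashCode() { return url.hashCode(); }
    }

    /** Local copy of the two documented method signatures. */
    interface ILinkExtractor {
        boolean accepts(String url, ContentType contentType);
        Set<Link> extractLinks(InputStream input, String reference,
                ContentType contentType) throws IOException;
    }

    /** Hypothetical implementation: collects absolute http(s) href values. */
    static final class HrefLinkExtractor implements ILinkExtractor {
        private static final Pattern HREF = Pattern.compile(
                "href\\s*=\\s*[\"'](https?://[^\"'\\s>]+)[\"']",
                Pattern.CASE_INSENSITIVE);

        @Override
        public boolean accepts(String url, ContentType contentType) {
            // Only run on documents declared as HTML.
            return contentType != null
                    && "text/html".equalsIgnoreCase(contentType.toString());
        }

        @Override
        public Set<Link> extractLinks(InputStream input, String reference,
                ContentType contentType) throws IOException {
            // Reading the whole stream keeps the sketch short; the charset
            // is assumed to be UTF-8 rather than detected.
            String html = new String(input.readAllBytes(), StandardCharsets.UTF_8);
            Set<Link> links = new LinkedHashSet<>();
            Matcher m = HREF.matcher(html);
            while (m.find()) {
                links.add(new Link(m.group(1), reference));
            }
            return links;
        }
    }

    public static void main(String[] args) throws IOException {
        String html = "<a href=\"https://example.com/a\">A</a>"
                + "<a href=\"/relative\">skipped by this sketch</a>";
        ContentType type = new ContentType("text/html");
        String reference = "https://example.com/index.html";
        ILinkExtractor extractor = new HrefLinkExtractor();

        // Check accepts(...) first, then extract from the document stream.
        if (extractor.accepts(reference, type)) {
            InputStream in = new ByteArrayInputStream(
                    html.getBytes(StandardCharsets.UTF_8));
            for (Link link : extractor.extractLinks(in, reference, type)) {
                System.out.println(link.getUrl() + " (from " + link.getReferrer() + ")");
            }
        }
    }
}
```

Calling `accepts` before `extractLinks` is a choice made in this sketch; a real crawler may apply the check elsewhere. Relative links are skipped here only to keep the regular expression simple.
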

Set<Link> extractLinks(InputStream input, String reference, ContentType contentType) throws IOException

Parameters:
input - the document input stream
reference - document reference (URL)
contentType - the document content type

Throws:
IOException - problem reading the document

boolean accepts(String url, ContentType contentType)

Parameters:
url - the url
contentType - the content type

Returns:
true if the given URL is accepted

Copyright © 2009–2021 Norconex Inc. All rights reserved.