Package com.norconex.collector.http.link
Class AbstractLinkExtractor
java.lang.Object
com.norconex.collector.http.link.AbstractLinkExtractor
- All Implemented Interfaces:
ILinkExtractor, IXMLConfigurable
- Direct Known Subclasses:
AbstractTextLinkExtractor, TikaLinkExtractor
public abstract class AbstractLinkExtractor
extends Object
implements ILinkExtractor, IXMLConfigurable
Base class for link extraction providing common configuration settings.
Subclasses inherit the following:
XML configuration usage:
XML usage example:
The above example will apply to any content type starting with "text/".
- Since:
- 3.0.0
- Author:
- Pascal Essiembre
-
Constructor Summary
Constructor | Description
AbstractLinkExtractor() |

Method Summary
Modifier and Type | Method | Description
void | addRestriction(PropertyMatcher... restrictions) | Adds one or more restrictions this extractor should be restricted to.
void | addRestrictions(List<PropertyMatcher> restrictions) | Adds restrictions this extractor should be restricted to.
void | clearRestrictions() | Clears all restrictions.
boolean | equals(Object) |
 | extractLinks(CrawlDoc doc) |
abstract void | extractLinks(Set<Link> links, CrawlDoc doc) |
 | getRestrictions() | Gets all restrictions.
int | hashCode() |
final void | loadFromXML(XML xml) |
protected abstract void | loadLinkExtractorFromXML(XML xml) | Loads configuration settings specific to the implementing class.
boolean | removeRestriction(PropertyMatcher restriction) | Removes a restriction.
int | removeRestriction(String field) | Removes all restrictions on a given field.
protected abstract void | saveLinkExtractorToXML(XML xml) | Saves configuration settings specific to the implementing class.
final void | saveToXML(XML xml) |
void | setRestrictions(List<PropertyMatcher> restrictions) | Sets restrictions this extractor should be restricted to.
String | toString() |
-
Constructor Details
-
AbstractLinkExtractor
public AbstractLinkExtractor()
-
-
Method Details
-
extractLinks
- Specified by:
extractLinks in interface ILinkExtractor
- Throws:
IOException
-
extractLinks
- Throws:
IOException
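
The two extractLinks overloads suggest a delegation pattern: a public entry point that prepares a link set and an abstract method that subclasses fill in. The sketch below illustrates that pattern only. It is not the Norconex implementation: ExtractorSketch, WhitespaceUrlExtractor, and the accepts check are hypothetical stand-ins, plain strings replace CrawlDoc and Link, and the assumption that the entry point returns the collected set (empty when the document is not accepted) is not confirmed by this page.

```java
import java.io.IOException;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the base class; strings replace CrawlDoc and Link.
abstract class ExtractorSketch {

    // Assumed applicability check standing in for the configured restrictions.
    protected boolean accepts(String contentType) {
        return contentType.startsWith("text/");
    }

    // Public entry point: creates the set, then delegates to the subclass.
    public final Set<String> extractLinks(String contentType, String body)
            throws IOException {
        Set<String> links = new TreeSet<>();
        if (accepts(contentType)) {
            extractLinks(links, body);
        }
        return links;
    }

    // Subclasses add whatever links they find to the supplied set.
    public abstract void extractLinks(Set<String> links, String body)
            throws IOException;
}

// Hypothetical subclass: treats whitespace-separated http(s) tokens as links.
class WhitespaceUrlExtractor extends ExtractorSketch {
    @Override
    public void extractLinks(Set<String> links, String body) {
        for (String token : body.split("\\s+")) {
            if (token.startsWith("http://") || token.startsWith("https://")) {
                links.add(token);
            }
        }
    }
}

class ExtractLinksDemo {
    public static void main(String[] args) throws IOException {
        ExtractorSketch extractor = new WhitespaceUrlExtractor();
        System.out.println(extractor.extractLinks(
                "text/plain", "see https://a.example and http://b.example now"));
        System.out.println(extractor.extractLinks(
                "image/png", "https://a.example"));
    }
}
```

Keeping the entry point final in this sketch means a subclass only decides how links are found, not whether the document qualifies.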
-
addRestriction
Adds one or more restrictions this extractor should be restricted to.
- Parameters:
restrictions - the restrictions
-
addRestrictions
Adds restrictions this extractor should be restricted to.
- Parameters:
restrictions - the restrictions
-
setRestrictions
Sets restrictions this extractor should be restricted to.
- Parameters:
restrictions - the restrictions
-
removeRestriction
Removes all restrictions on a given field.
- Parameters:
field - the field to remove restrictions on
- Returns:
how many elements were removed
-
removeRestriction
Removes a restriction.
- Parameters:
restriction - the restriction to remove
- Returns:
true if this extractor contained the restriction
-
clearRestrictions
public void clearRestrictions()
Clears all restrictions.
-
getRestrictions
Gets all restrictions.
- Returns:
the restrictions
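
The return values documented for the restriction methods (a count when removing by field, a boolean when removing a single restriction) can be illustrated with a small self-contained sketch. FieldRestriction and RestrictionHolder are hypothetical stand-ins written for this example, with FieldRestriction taking the place of PropertyMatcher; the field name "contentType" and the patterns are made up as well.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for PropertyMatcher: a field name plus a pattern.
final class FieldRestriction {
    final String field;
    final String pattern;

    FieldRestriction(String field, String pattern) {
        this.field = field;
        this.pattern = pattern;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof FieldRestriction)) {
            return false;
        }
        FieldRestriction r = (FieldRestriction) o;
        return field.equals(r.field) && pattern.equals(r.pattern);
    }

    @Override
    public int hashCode() {
        return Objects.hash(field, pattern);
    }
}

// Hypothetical holder mirroring the restriction methods documented above.
class RestrictionHolder {
    private final List<FieldRestriction> restrictions = new ArrayList<>();

    void addRestriction(FieldRestriction... toAdd) {
        restrictions.addAll(Arrays.asList(toAdd));
    }

    // Replaces whatever was configured before.
    void setRestrictions(List<FieldRestriction> replacement) {
        restrictions.clear();
        restrictions.addAll(replacement);
    }

    // Removes every restriction on the field and reports how many were removed.
    int removeRestriction(String field) {
        int before = restrictions.size();
        restrictions.removeIf(r -> r.field.equals(field));
        return before - restrictions.size();
    }

    // Removes one restriction; true if this holder contained it.
    boolean removeRestriction(FieldRestriction restriction) {
        return restrictions.remove(restriction);
    }

    void clearRestrictions() {
        restrictions.clear();
    }

    // Returns a copy so callers cannot modify the internal list.
    List<FieldRestriction> getRestrictions() {
        return new ArrayList<>(restrictions);
    }
}

class RestrictionDemo {
    public static void main(String[] args) {
        RestrictionHolder holder = new RestrictionHolder();
        holder.addRestriction(
                new FieldRestriction("contentType", "text/.*"),
                new FieldRestriction("contentType", "application/xml"),
                new FieldRestriction("language", "en"));
        System.out.println(holder.removeRestriction("contentType"));
        System.out.println(holder.removeRestriction(
                new FieldRestriction("language", "en")));
        System.out.println(holder.getRestrictions().size());
    }
}
```

Whether the real getRestrictions returns a copy or a live view is not stated on this page; the defensive copy here is a choice made for the sketch.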
-
loadFromXML
- Specified by:
loadFromXML in interface IXMLConfigurable
-
loadLinkExtractorFromXML
Loads configuration settings specific to the implementing class.
- Parameters:
xml - XML configuration
-
saveToXML
- Specified by:
saveToXML in interface IXMLConfigurable
-
saveLinkExtractorToXML
Saves configuration settings specific to the implementing class.
- Parameters:
xml - the XML
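
The pairing of final loadFromXML/saveToXML with the protected abstract loadLinkExtractorFromXML/saveLinkExtractorToXML reads like the template method pattern: the base class handles shared settings and then hands off to a subclass hook. The following is a minimal sketch of that pattern under stated assumptions, not the library's code. A Map stands in for the XML object, and BaseExtractorSketch, MaxLengthExtractorSketch, and the setting names restrictedField and maxUrlLength are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical base class; a Map<String, String> stands in for the XML object.
abstract class BaseExtractorSketch {
    private String restrictedField = "";

    // Final entry point: reads the shared setting, then calls the subclass hook.
    public final void loadFromConfig(Map<String, String> cfg) {
        restrictedField = cfg.getOrDefault("restrictedField", "");
        loadSpecificFromConfig(cfg);
    }

    // Final entry point: writes the shared setting, then calls the subclass hook.
    public final void saveToConfig(Map<String, String> cfg) {
        cfg.put("restrictedField", restrictedField);
        saveSpecificToConfig(cfg);
    }

    // Hooks for settings specific to the implementing class.
    protected abstract void loadSpecificFromConfig(Map<String, String> cfg);

    protected abstract void saveSpecificToConfig(Map<String, String> cfg);
}

// Hypothetical subclass with one setting of its own.
class MaxLengthExtractorSketch extends BaseExtractorSketch {
    int maxUrlLength = 2048;

    @Override
    protected void loadSpecificFromConfig(Map<String, String> cfg) {
        maxUrlLength = Integer.parseInt(cfg.getOrDefault("maxUrlLength", "2048"));
    }

    @Override
    protected void saveSpecificToConfig(Map<String, String> cfg) {
        cfg.put("maxUrlLength", Integer.toString(maxUrlLength));
    }
}

class ConfigRoundTripDemo {
    public static void main(String[] args) {
        Map<String, String> in = new LinkedHashMap<>();
        in.put("restrictedField", "contentType");
        in.put("maxUrlLength", "512");

        MaxLengthExtractorSketch extractor = new MaxLengthExtractorSketch();
        extractor.loadFromConfig(in);

        // Saving into a fresh map should reproduce both settings.
        Map<String, String> out = new LinkedHashMap<>();
        extractor.saveToConfig(out);
        System.out.println(out);
    }
}
```

Because the public methods are final in this sketch, a subclass cannot skip the shared settings by accident; it only supplies its own part.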
-
equals
-
hashCode
public int hashCode()
-
toString
-