public abstract class AbstractDocumentFilter extends AbstractImporterHandler implements IDocumentFilter, IOnMatchFilter
Base class for document filters. Subclasses can be configured with an
attribute called "onMatch". This class handles the logic of whether to
include or exclude a document when it is matched. Subclasses only
need to determine whether the document is matched, by
implementing the
isDocumentMatched(HandlerDoc, InputStream, ParseState)
method.
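The division of labor described above can be sketched with simplified stand-in types. This is only an illustration, not the Importer source: SketchFilter, ContainsWordFilter, the String-based isMatched signature, the local OnMatch enum, and the INCLUDE default are all assumptions made so the example compiles on its own. It models the verdict of one filter in isolation.

```java
// Stand-in for the library's OnMatch type (assumed here to have two values).
enum OnMatch { INCLUDE, EXCLUDE }

// Simplified model of the base class: it alone applies the onMatch setting.
abstract class SketchFilter {
    private OnMatch onMatch = OnMatch.INCLUDE; // default chosen for this sketch

    final void setOnMatch(OnMatch onMatch) { this.onMatch = onMatch; }
    final OnMatch getOnMatch() { return onMatch; }

    // Verdict of this single filter: a match is accepted on INCLUDE and
    // rejected on EXCLUDE; a non-match yields the opposite verdict.
    final boolean accept(String content) {
        boolean matched = isMatched(content);
        return matched == (onMatch == OnMatch.INCLUDE);
    }

    // The only method a subclass has to provide.
    protected abstract boolean isMatched(String content);
}

// Hypothetical subclass: matches when the content contains a given word.
final class ContainsWordFilter extends SketchFilter {
    private final String word;

    ContainsWordFilter(String word) { this.word = word; }

    @Override
    protected boolean isMatched(String content) {
        return content.contains(word);
    }
}

final class SubclassDemo {
    public static void main(String[] args) {
        ContainsWordFilter filter = new ContainsWordFilter("draft");
        filter.setOnMatch(OnMatch.EXCLUDE);
        System.out.println(filter.accept("a draft memo")); // false
        System.out.println(filter.accept("final report")); // true
    }
}
```

The subclass never inspects onMatch; it only answers whether the content matched.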
The logic for accepting or rejecting documents when a subclass condition is met ("matches") is as follows:
Matches? | On match | Expected behavior |
---|---|---|
yes | exclude | Document is rejected. |
yes | include | Document is accepted. |
no | exclude | Document is accepted. |
no | include | Document is accepted if it was accepted by at least one other filter with onMatch="include". If no other such filter exists, or if none matched, the document is rejected. |
When multiple filters are defined and a combination of both "include" and "exclude" is possible, "exclude" always takes precedence. In other words, it only takes one matching "exclude" to reject a document, no matter how many matching "include" filters were triggered.
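The table and the precedence rule above can be modeled as a small function over the outcomes of several filters. This is a sketch under stated assumptions, not the Importer's actual implementation: FilterOutcome, FilterChainSketch, and the local OnMatch enum are hypothetical stand-ins introduced for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Stand-in for the library's OnMatch type (assumed here to have two values).
enum OnMatch { INCLUDE, EXCLUDE }

// Hypothetical record of one filter's setting and whether it matched.
final class FilterOutcome {
    final OnMatch onMatch;
    final boolean matched;

    FilterOutcome(OnMatch onMatch, boolean matched) {
        this.onMatch = onMatch;
        this.matched = matched;
    }
}

final class FilterChainSketch {
    // Applies the documented rules to a set of filter outcomes.
    static boolean isAccepted(List<FilterOutcome> outcomes) {
        boolean hasInclude = false;
        boolean includeMatched = false;
        for (FilterOutcome o : outcomes) {
            if (o.onMatch == OnMatch.EXCLUDE) {
                if (o.matched) {
                    return false; // one matching "exclude" always rejects
                }
            } else {
                hasInclude = true;
                includeMatched |= o.matched;
            }
        }
        // With "include" filters present, at least one of them must match.
        return !hasInclude || includeMatched;
    }
}

final class FilterChainDemo {
    public static void main(String[] args) {
        // A matching "include" does not override a matching "exclude".
        System.out.println(FilterChainSketch.isAccepted(Arrays.asList(
                new FilterOutcome(OnMatch.INCLUDE, true),
                new FilterOutcome(OnMatch.EXCLUDE, true)))); // false

        // One of two "include" filters matched, so the document is kept.
        System.out.println(FilterChainSketch.isAccepted(Arrays.asList(
                new FilterOutcome(OnMatch.INCLUDE, false),
                new FilterOutcome(OnMatch.INCLUDE, true)))); // true
    }
}
```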
onMatch="[include|exclude]"
Subclasses inherit the above IXMLConfigurable attribute(s), in addition to <restrictTo>.
Constructor and Description |
---|
AbstractDocumentFilter() |
Modifier and Type | Method and Description |
---|---|
boolean | acceptDocument(HandlerDoc doc, InputStream input, ParseState parseState): Whether to accept a document. |
boolean | equals(Object other) |
OnMatch | getOnMatch(): Gets the on-match action (exclude or include). |
int | hashCode() |
protected abstract boolean | isDocumentMatched(HandlerDoc doc, InputStream input, ParseState parseState) |
protected abstract void | loadFilterFromXML(XML xml) |
protected void | loadHandlerFromXML(XML xml): Loads configuration settings specific to the implementing class. |
protected abstract void | saveFilterToXML(XML xml) |
protected void | saveHandlerToXML(XML xml): Saves configuration settings specific to the implementing class. |
void | setOnMatch(OnMatch onMatch) |
String | toString() |
Methods inherited from class AbstractImporterHandler:
addRestriction, addRestriction, addRestrictions, clearRestrictions, detectCharsetIfBlank, getRestrictions, isApplicable, loadFromXML, removeRestriction, removeRestriction, saveToXML
public OnMatch getOnMatch()
Specified by: getOnMatch in interface IOnMatchFilter
public final void setOnMatch(OnMatch onMatch)
public boolean acceptDocument(HandlerDoc doc, InputStream input, ParseState parseState) throws ImporterHandlerException
Specified by: acceptDocument in interface IDocumentFilter
Parameters:
doc - the document to evaluate
input - document content
parseState - whether the document has been parsed already or not (a parsed document should normally be text-based)
Returns: true if document is accepted
Throws: ImporterHandlerException - problem reading the document
protected abstract boolean isDocumentMatched(HandlerDoc doc, InputStream input, ParseState parseState) throws ImporterHandlerException
Throws: ImporterHandlerException
protected final void saveHandlerToXML(XML xml)
Specified by: saveHandlerToXML in class AbstractImporterHandler
Parameters:
xml - the XML
protected abstract void saveFilterToXML(XML xml)
protected final void loadHandlerFromXML(XML xml)
Specified by: loadHandlerFromXML in class AbstractImporterHandler
Parameters:
xml - XML configuration
protected abstract void loadFilterFromXML(XML xml)
public boolean equals(Object other)
Overrides: equals in class AbstractImporterHandler
public int hashCode()
Overrides: hashCode in class AbstractImporterHandler
public String toString()
Overrides: toString in class AbstractImporterHandler
Copyright © 2009–2023 Norconex Inc. All rights reserved.