public abstract class AbstractCharStreamFilter extends AbstractDocumentFilter
Base class for filters dealing with the body of text documents only.
Subclasses can safely be used as either pre-parse or post-parse handlers
restricted to text documents only (see AbstractImporterHandler).
When used as a pre-parse handler, this class uses the detected or
previously set content character encoding unless the character encoding
was specified using setSourceCharset(String). Since document parsing
converts content to UTF-8, UTF-8 is always assumed when used as a
post-parse handler.
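The encoding rules above can be read as a simple order of precedence. The following is a minimal, self-contained sketch of that order and not the library's actual code: the class `CharsetResolutionSketch`, the helper `resolveCharset`, its parameters, and the UTF-8 fallback used when no encoding is known are all assumptions made for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; this class is not part of the documented API.
final class CharsetResolutionSketch {

    private CharsetResolutionSketch() {
    }

    /**
     * Picks the encoding used to decode the document body.
     *
     * @param sourceCharset   value given to setSourceCharset(String), may be null
     * @param detectedCharset detected or previously set encoding, may be null
     * @param postParse       true when running as a post-parse handler
     */
    static Charset resolveCharset(
            String sourceCharset, String detectedCharset, boolean postParse) {
        // Parsing converts content to UTF-8, so post-parse always assumes UTF-8.
        if (postParse) {
            return StandardCharsets.UTF_8;
        }
        // Pre-parse: an explicitly configured encoding takes precedence.
        if (isSet(sourceCharset)) {
            return Charset.forName(sourceCharset.trim());
        }
        // Otherwise use the detected or previously set encoding.
        if (isSet(detectedCharset)) {
            return Charset.forName(detectedCharset.trim());
        }
        // Assumption: fall back to UTF-8 when nothing else is known.
        return StandardCharsets.UTF_8;
    }

    private static boolean isSet(String value) {
        return value != null && !value.trim().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(resolveCharset("ISO-8859-1", "UTF-16", false));
        System.out.println(resolveCharset(null, "UTF-16", false));
        System.out.println(resolveCharset("ISO-8859-1", "UTF-16", true));
    }
}
```

The three calls in `main` exercise the explicit, detected, and post-parse cases in turn.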
sourceCharset="(character encoding)"
onMatch="[include|exclude]"
Subclasses inherit the above IXMLConfigurable attribute(s), in addition to <restrictTo>.
Constructor and Description |
---|
AbstractCharStreamFilter() |
Modifier and Type | Method and Description |
---|---|
boolean | equals(Object other) |
String | getSourceCharset(): Gets the assumed source character encoding. |
int | hashCode() |
protected boolean | isDocumentMatched(HandlerDoc doc, InputStream input, ParseState parseState) |
protected abstract boolean | isTextDocumentMatching(HandlerDoc doc, Reader input, ParseState parseState) |
protected abstract void | loadCharStreamFilterFromXML(XML xml): Loads configuration settings specific to the implementing class. |
protected void | loadFilterFromXML(XML xml) |
protected abstract void | saveCharStreamFilterToXML(XML xml): Saves configuration settings specific to the implementing class. |
protected void | saveFilterToXML(XML xml) |
void | setSourceCharset(String sourceCharset): Sets the assumed source character encoding. |
String | toString() |
Methods inherited from class AbstractDocumentFilter:
acceptDocument, getOnMatch, loadHandlerFromXML, saveHandlerToXML, setOnMatch
Methods inherited from class AbstractImporterHandler:
addRestriction, addRestriction, addRestrictions, clearRestrictions, detectCharsetIfBlank, getRestrictions, isApplicable, loadFromXML, removeRestriction, removeRestriction, saveToXML
public String getSourceCharset()

public void setSourceCharset(String sourceCharset)
Parameters:
sourceCharset - character encoding of the source to be transformed

protected final boolean isDocumentMatched(HandlerDoc doc, InputStream input, ParseState parseState) throws ImporterHandlerException
isDocumentMatched in class AbstractDocumentFilter
Throws:
ImporterHandlerException

protected abstract boolean isTextDocumentMatching(HandlerDoc doc, Reader input, ParseState parseState) throws ImporterHandlerException
Throws:
ImporterHandlerException
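
The pairing of a final isDocumentMatched taking an InputStream with an abstract isTextDocumentMatching taking a Reader suggests a template-method arrangement: the base class decodes the bytes and the subclass only deals with characters. The sketch below illustrates that idea with stand-in types. `CharStreamFilterSketch` and `ContainsWordFilterSketch` are hypothetical names, the HandlerDoc and ParseState arguments are left out, and failures are wrapped in UncheckedIOException here, whereas the documented methods declare ImporterHandlerException.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the documented base class (simplified signatures).
abstract class CharStreamFilterSketch {

    private final Charset charset;

    CharStreamFilterSketch(Charset charset) {
        this.charset = charset;
    }

    // Plays the role of the final isDocumentMatched(...): decode, then delegate.
    final boolean isDocumentMatched(InputStream input) {
        // The caller owns the stream, so the reader is not closed here.
        Reader reader = new InputStreamReader(input, charset);
        try {
            return isTextDocumentMatching(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Plays the role of the abstract isTextDocumentMatching(...).
    abstract boolean isTextDocumentMatching(Reader input) throws IOException;
}

// Hypothetical subclass: matches when any line of the text contains a word.
final class ContainsWordFilterSketch extends CharStreamFilterSketch {

    private final String word;

    ContainsWordFilterSketch(String word, Charset charset) {
        super(charset);
        this.word = word;
    }

    @Override
    boolean isTextDocumentMatching(Reader input) throws IOException {
        BufferedReader lines = new BufferedReader(input);
        String line;
        while ((line = lines.readLine()) != null) {
            if (line.contains(word)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] body = "caf\u00e9 cr\u00e8me".getBytes(StandardCharsets.ISO_8859_1);
        ContainsWordFilterSketch filter =
                new ContainsWordFilterSketch("caf\u00e9", StandardCharsets.ISO_8859_1);
        System.out.println(filter.isDocumentMatched(new ByteArrayInputStream(body)));
    }
}
```

Decoding the same bytes with the wrong encoding would garble the accented characters, so the match would fail. That is why the source encoding matters when a filter runs before parsing.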

protected final void saveFilterToXML(XML xml)
saveFilterToXML in class AbstractDocumentFilter

protected abstract void saveCharStreamFilterToXML(XML xml)
Parameters:
xml - the XML

protected final void loadFilterFromXML(XML xml)
loadFilterFromXML in class AbstractDocumentFilter

protected abstract void loadCharStreamFilterFromXML(XML xml)
Parameters:
xml - XML configuration

public boolean equals(Object other)
equals in class AbstractDocumentFilter

public int hashCode()
hashCode in class AbstractDocumentFilter

public String toString()
toString in class AbstractDocumentFilter
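
The final saveFilterToXML and loadFilterFromXML methods, together with their abstract saveCharStreamFilterToXML and loadCharStreamFilterFromXML counterparts, point to a two-level split: shared settings such as sourceCharset are presumably handled once in this class, while subclass-specific settings are handled by the subclass. The sketch below illustrates that split under stated assumptions. It uses java.util.Properties as a stand-in for the XML type, and every class, method, and the `keyword` setting are hypothetical.

```java
import java.util.Properties;

// Hypothetical stand-in for the base class; Properties replaces the XML type.
abstract class ConfigDelegationSketch {

    private String sourceCharset;

    String getSourceCharset() {
        return sourceCharset;
    }

    void setSourceCharset(String sourceCharset) {
        this.sourceCharset = sourceCharset;
    }

    // Plays the role of loadFilterFromXML: shared setting first, then subclass.
    final void loadFilter(Properties config) {
        // Assumption: keep the current value when the setting is absent.
        setSourceCharset(config.getProperty("sourceCharset", sourceCharset));
        loadCharStreamFilter(config);
    }

    // Plays the role of saveFilterToXML.
    final void saveFilter(Properties config) {
        if (sourceCharset != null) {
            config.setProperty("sourceCharset", sourceCharset);
        }
        saveCharStreamFilter(config);
    }

    // Subclass hooks for settings specific to the implementing class.
    abstract void loadCharStreamFilter(Properties config);

    abstract void saveCharStreamFilter(Properties config);
}

// Hypothetical subclass with one setting of its own.
final class KeywordFilterSketch extends ConfigDelegationSketch {

    private String keyword = "";

    String getKeyword() {
        return keyword;
    }

    @Override
    void loadCharStreamFilter(Properties config) {
        keyword = config.getProperty("keyword", keyword);
    }

    @Override
    void saveCharStreamFilter(Properties config) {
        config.setProperty("keyword", keyword);
    }

    public static void main(String[] args) {
        Properties source = new Properties();
        source.setProperty("sourceCharset", "ISO-8859-1");
        source.setProperty("keyword", "norconex");

        KeywordFilterSketch filter = new KeywordFilterSketch();
        filter.loadFilter(source);

        Properties saved = new Properties();
        filter.saveFilter(saved);
        System.out.println(source.equals(saved));
    }
}
```

Marking the outer methods final means a subclass cannot accidentally skip the shared settings. It only implements the two hooks.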
Copyright © 2009–2023 Norconex Inc. All rights reserved.