public abstract class AbstractCharStreamCondition extends Object implements IImporterCondition, IXMLConfigurable
Base class for conditions dealing with the document content as text. Subclasses can safely be used as either pre-parse or post-parse handler conditions restricted to text documents only.
When used as a pre-parse handler, this class uses the detected or previously set content character encoding, unless the character encoding was specified explicitly using setSourceCharset(String). Since document parsing converts content to UTF-8, UTF-8 is always assumed when used as a post-parse handler.
sourceCharset="(character encoding)"
Subclasses inherit the above IXMLConfigurable attribute(s).
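The encoding rules described above can be sketched with the JDK alone. This is a minimal illustration, not the library's implementation: the class `CharsetResolutionSketch`, its `resolve` method, and the final UTF-8 fallback when no encoding is known are all assumptions made for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of the Norconex API. */
final class CharsetResolutionSketch {

    private CharsetResolutionSketch() { }

    /**
     * Picks the charset used to decode document content.
     *
     * @param parsed          true when running as a post-parse handler
     * @param sourceCharset   explicitly configured encoding name, or null
     * @param detectedCharset detected or previously set encoding name, or null
     */
    static Charset resolve(
            boolean parsed, String sourceCharset, String detectedCharset) {
        if (parsed) {
            // Parsing converts content to UTF-8, so UTF-8 is always assumed.
            return StandardCharsets.UTF_8;
        }
        if (sourceCharset != null && !sourceCharset.trim().isEmpty()) {
            // An explicitly configured encoding wins before parsing.
            return Charset.forName(sourceCharset.trim());
        }
        if (detectedCharset != null && !detectedCharset.trim().isEmpty()) {
            return Charset.forName(detectedCharset.trim());
        }
        // Assumption for this sketch only: default to UTF-8 when nothing is known.
        return StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        System.out.println(resolve(false, "ISO-8859-1", "UTF-16"));
        System.out.println(resolve(false, null, "UTF-16"));
        System.out.println(resolve(true, "ISO-8859-1", "UTF-16"));
    }
}
```

The three calls in `main` cover the explicit, detected, and post-parse cases in that order.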
AbstractCharStreamFilter
Constructor and Description |
---|
AbstractCharStreamCondition() |
Modifier and Type | Method and Description |
---|---|
boolean | equals(Object other) |
String | getSourceCharset() Gets the presumed source character encoding. |
int | hashCode() |
protected abstract void | loadCharStreamConditionFromXML(XML xml) Loads configuration settings specific to the implementing class. |
void | loadFromXML(XML xml) |
protected abstract void | saveCharStreamConditionToXML(XML xml) Saves configuration settings specific to the implementing class. |
void | saveToXML(XML xml) |
void | setSourceCharset(String sourceCharset) Sets the presumed source character encoding. |
boolean | testDocument(HandlerDoc doc, InputStream input, ParseState parseState) Tests a given document. |
protected abstract boolean | testDocument(HandlerDoc doc, Reader input, ParseState parseState) |
String | toString() |
public String getSourceCharset()
public void setSourceCharset(String sourceCharset)
sourceCharset - character encoding of the source to be transformed
public final boolean testDocument(HandlerDoc doc, InputStream input, ParseState parseState) throws ImporterHandlerException
IImporterCondition
testDocument in interface IImporterCondition
doc - the document to test
input - document content
parseState - whether the document has been parsed already or not (a parsed document should normally be text-based)
true if the condition evaluates as such
ImporterHandlerException - problem reading the document
protected abstract boolean testDocument(HandlerDoc doc, Reader input, ParseState parseState) throws ImporterHandlerException
ImporterHandlerException
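The pairing of a final InputStream-based testDocument with an abstract Reader-based one suggests that the base class decodes bytes and hands characters to the subclass. The sketch below illustrates that shape with JDK types only. All names in it (`ReaderDelegationSketch`, `CharStreamCondition`, `ContainsWord`) are hypothetical stand-ins, and wrapping `IOException` in an unchecked exception is a simplification chosen for the example, not the library's behavior.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration, not part of the Norconex API. */
final class ReaderDelegationSketch {

    /** Stand-in for a base class that decodes bytes before testing. */
    abstract static class CharStreamCondition {
        private final Charset charset;

        CharStreamCondition(Charset charset) {
            this.charset = charset;
        }

        /** Byte-oriented entry point: decodes, then delegates to the hook. */
        final boolean test(InputStream input) {
            try {
                // The caller keeps ownership of the stream; it is not closed here.
                return test(new InputStreamReader(input, charset));
            } catch (IOException e) {
                throw new UncheckedIOException("Problem reading the document", e);
            }
        }

        /** Character-oriented hook supplied by subclasses. */
        protected abstract boolean test(Reader input) throws IOException;
    }

    /** Example subclass: true when the decoded text contains a given word. */
    static final class ContainsWord extends CharStreamCondition {
        private final String word;

        ContainsWord(String word, Charset charset) {
            super(charset);
            this.word = word;
        }

        @Override
        protected boolean test(Reader input) throws IOException {
            StringBuilder text = new StringBuilder();
            char[] buffer = new char[1024];
            int read;
            while ((read = input.read(buffer)) != -1) {
                text.append(buffer, 0, read);
            }
            return text.indexOf(word) >= 0;
        }
    }

    public static void main(String[] args) {
        byte[] latin1 = "caf\u00e9 au lait".getBytes(StandardCharsets.ISO_8859_1);
        CharStreamCondition condition =
                new ContainsWord("caf\u00e9", StandardCharsets.ISO_8859_1);
        System.out.println(condition.test(new ByteArrayInputStream(latin1)));
    }
}
```

Decoding the same bytes with the wrong charset would make the match fail, which is why the source encoding matters for pre-parse use.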
public final void loadFromXML(XML xml)
loadFromXML in interface IXMLConfigurable
public final void saveToXML(XML xml)
saveToXML in interface IXMLConfigurable
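Because loadFromXML and saveToXML are final while loadCharStreamConditionFromXML and saveCharStreamConditionToXML are abstract, the arrangement resembles a template method: the base class presumably handles the shared sourceCharset attribute and leaves the rest to the subclass. The sketch below shows that pattern under stated assumptions. It uses a plain `Map<String, String>` in place of the library's XML type, and every class and method name in it is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration, not part of the Norconex API. */
final class XmlTemplateSketch {

    /** Stand-in base class; a Map plays the role of the XML object. */
    abstract static class ConfigurableBase {
        private String sourceCharset;

        /** Final: reads the shared attribute, then calls the subclass hook. */
        final void load(Map<String, String> xml) {
            sourceCharset = xml.get("sourceCharset");
            loadSpecific(xml);
        }

        /** Final: writes the shared attribute, then calls the subclass hook. */
        final void save(Map<String, String> xml) {
            if (sourceCharset != null) {
                xml.put("sourceCharset", sourceCharset);
            }
            saveSpecific(xml);
        }

        String getSourceCharset() {
            return sourceCharset;
        }

        /** Loads settings specific to the implementing class. */
        protected abstract void loadSpecific(Map<String, String> xml);

        /** Saves settings specific to the implementing class. */
        protected abstract void saveSpecific(Map<String, String> xml);
    }

    /** Example subclass with a single setting of its own. */
    static final class WordCondition extends ConfigurableBase {
        private String word;

        @Override
        protected void loadSpecific(Map<String, String> xml) {
            word = xml.get("word");
        }

        @Override
        protected void saveSpecific(Map<String, String> xml) {
            if (word != null) {
                xml.put("word", word);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> source = new LinkedHashMap<>();
        source.put("sourceCharset", "ISO-8859-1");
        source.put("word", "invoice");

        WordCondition condition = new WordCondition();
        condition.load(source);

        Map<String, String> saved = new LinkedHashMap<>();
        condition.save(saved);
        System.out.println(saved);
    }
}
```

Keeping the public methods final means a subclass cannot accidentally skip the shared attribute when it adds its own settings.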
protected abstract void saveCharStreamConditionToXML(XML xml)
xml - the XML
protected abstract void loadCharStreamConditionFromXML(XML xml)
xml - XML configuration
Copyright © 2009–2023 Norconex Inc. All rights reserved.