Class AbstractDocumentTransformer
java.lang.Object
com.norconex.importer.handler.AbstractImporterHandler
com.norconex.importer.handler.transformer.AbstractDocumentTransformer
- All Implemented Interfaces:
IXMLConfigurable, IImporterHandler, IDocumentTransformer
- Direct Known Subclasses:
AbstractCharStreamTransformer, CharsetTransformer, DOMDeleteTransformer, DOMPreserveTransformer, ExternalTransformer, ImageTransformer, NoContentTransformer
public abstract class AbstractDocumentTransformer
extends AbstractImporterHandler
implements IDocumentTransformer
Base class for transformers.
Subclasses inherit this IXMLConfigurable configuration:
<!-- multiple "restrictTo" tags allowed (only one needs to match) -->
<restrictTo>
<fieldMatcher>(field-matching expression)</fieldMatcher>
<valueMatcher>(value-matching expression)</valueMatcher>
</restrictTo>
- Since:
- 2.0.0
- Author:
- Pascal Essiembre
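The "only one needs to match" rule for the restrictTo configuration above can be illustrated with a minimal, self-contained sketch. It does not use the Norconex classes: RestrictionSketch and RestrictionsSketch are hypothetical stand-ins, the matchers are assumed to be plain regular expressions, and an empty restriction list is assumed to mean "applies to every document".

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-in for one <restrictTo> entry (not the Norconex class).
final class RestrictionSketch {
    private final Pattern fieldPattern;
    private final Pattern valuePattern;

    // Assumption: both matchers are treated as full-match regular expressions.
    RestrictionSketch(String fieldRegex, String valueRegex) {
        this.fieldPattern = Pattern.compile(fieldRegex);
        this.valuePattern = Pattern.compile(valueRegex);
    }

    // True when some field whose name matches fieldPattern
    // holds at least one value matching valuePattern.
    boolean matches(Map<String, List<String>> metadata) {
        for (Map.Entry<String, List<String>> entry : metadata.entrySet()) {
            if (!fieldPattern.matcher(entry.getKey()).matches()) {
                continue;
            }
            for (String value : entry.getValue()) {
                if (valuePattern.matcher(value).matches()) {
                    return true;
                }
            }
        }
        return false;
    }
}

// Hypothetical helper that combines several restrictions with OR semantics.
final class RestrictionsSketch {
    private RestrictionsSketch() {
    }

    static boolean isApplicable(
            List<RestrictionSketch> restrictions,
            Map<String, List<String>> metadata) {
        // Assumption: no restriction configured means no filtering at all.
        if (restrictions.isEmpty()) {
            return true;
        }
        // Only one restriction needs to match.
        for (RestrictionSketch restriction : restrictions) {
            if (restriction.matches(metadata)) {
                return true;
            }
        }
        return false;
    }
}
```

With two restrictions, one on "application/pdf" and one on "text/.*", a document whose content-type field is "text/html" would be considered applicable in this sketch because the second restriction matches.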
Constructor Summary
Constructors
AbstractDocumentTransformer()
Method Summary
Modifier and Type | Method | Description
protected abstract void | transformApplicableDocument(HandlerDoc doc, InputStream input, OutputStream output, ParseState parseState) |
final void | transformDocument(HandlerDoc doc, InputStream input, OutputStream output, ParseState parseState) | Transforms document content and metadata.
Methods inherited from class com.norconex.importer.handler.AbstractImporterHandler
addRestriction, addRestriction, addRestrictions, clearRestrictions, detectCharsetIfBlank, equals, getRestrictions, hashCode, isApplicable, loadFromXML, loadHandlerFromXML, removeRestriction, removeRestriction, saveHandlerToXML, saveToXML, toString
Constructor Details
AbstractDocumentTransformer
public AbstractDocumentTransformer()
Method Details
transformDocument
public final void transformDocument(HandlerDoc doc, InputStream input, OutputStream output, ParseState parseState) throws ImporterHandlerException
Description copied from interface: IDocumentTransformer
Transforms document content and metadata.
- Specified by:
transformDocument in interface IDocumentTransformer
- Parameters:
doc - document
input - document content to transform
output - transformed document content
parseState - whether the document has been parsed already or not (a parsed document should normally be text-based)
- Throws:
ImporterHandlerException - could not transform the document
transformApplicableDocument
protected abstract void transformApplicableDocument(HandlerDoc doc, InputStream input, OutputStream output, ParseState parseState) throws ImporterHandlerException
- Throws:
ImporterHandlerException
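The pairing of a final transformDocument with an abstract transformApplicableDocument suggests a template-method arrangement: the base class decides whether a document qualifies and delegates the actual work to the subclass. The sketch below illustrates that pattern only. It is not the library's source: TransformerSketch, UpperCaseTransformerSketch and TransformDemoSketch are hypothetical names, a content-type string stands in for HandlerDoc and the restriction check, and IOException stands in for ImporterHandlerException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical stand-in for the abstract base class.
abstract class TransformerSketch {

    // Stand-in for the inherited restriction check.
    protected abstract boolean isApplicable(String contentType);

    // Subclasses implement only the transformation itself.
    protected abstract void transformApplicable(
            InputStream input, OutputStream output) throws IOException;

    // Final entry point: skips documents that do not qualify,
    // otherwise delegates to the subclass.
    public final void transform(
            String contentType, InputStream input, OutputStream output)
            throws IOException {
        if (!isApplicable(contentType)) {
            return; // In this sketch, nothing is written to the output.
        }
        transformApplicable(input, output);
    }
}

// Example subclass: upper-cases UTF-8 text content.
final class UpperCaseTransformerSketch extends TransformerSketch {
    @Override
    protected boolean isApplicable(String contentType) {
        return contentType.startsWith("text/");
    }

    @Override
    protected void transformApplicable(
            InputStream input, OutputStream output) throws IOException {
        String text = new String(input.readAllBytes(), StandardCharsets.UTF_8);
        output.write(
                text.toUpperCase(Locale.ROOT).getBytes(StandardCharsets.UTF_8));
    }
}

// Small helper that runs a transformer over an in-memory string.
final class TransformDemoSketch {
    private TransformDemoSketch() {
    }

    static String run(
            TransformerSketch transformer, String contentType, String content) {
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        try {
            transformer.transform(
                    contentType,
                    new ByteArrayInputStream(
                            content.getBytes(StandardCharsets.UTF_8)),
                    output);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(output.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Making the entry point final keeps the applicability check in one place, so a subclass cannot accidentally bypass it. A real subclass of AbstractDocumentTransformer would instead override transformApplicableDocument with the signature shown above.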