Class AbstractDocumentTagger
java.lang.Object
  com.norconex.importer.handler.AbstractImporterHandler
    com.norconex.importer.handler.tagger.AbstractDocumentTagger
- All Implemented Interfaces:
IXMLConfigurable, IImporterHandler, IDocumentTagger
- Direct Known Subclasses:
AbstractCharStreamTagger, CharacterCaseTagger, CharsetTagger, ConstantTagger, CopyTagger, CurrentDateTagger, DateFormatTagger, DebugTagger, DeleteTagger, DocumentLengthTagger, DOMTagger, ExternalTagger, FieldReportTagger, ForceSingleValueTagger, HierarchyTagger, KeepOnlyTagger, MergeTagger, RenameTagger, ReplaceTagger, TruncateTagger, UUIDTagger
public abstract class AbstractDocumentTagger extends AbstractImporterHandler implements IDocumentTagger
Base class for taggers.
Subclasses inherit this IXMLConfigurable configuration:
  <!-- multiple "restrictTo" tags allowed (only one needs to match) -->
  <restrictTo>
    <fieldMatcher>(field-matching expression)</fieldMatcher>
    <valueMatcher>(value-matching expression)</valueMatcher>
  </restrictTo>
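The "only one needs to match" rule in the configuration above can be sketched in plain Java. This is an illustration only: Restriction and RestrictionCheck are hypothetical stand-ins, not Norconex classes, and the sketch assumes that matcher expressions are regular expressions and that a handler with no restrictions applies to every document. Neither assumption is stated on this page.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-in for one <restrictTo> entry; not a Norconex class.
final class Restriction {
    private final Pattern fieldMatcher;
    private final Pattern valueMatcher;

    // Assumption: both expressions are treated as regular expressions.
    Restriction(String fieldRegex, String valueRegex) {
        this.fieldMatcher = Pattern.compile(fieldRegex);
        this.valueMatcher = Pattern.compile(valueRegex);
    }

    // True when a field whose name matches has at least one matching value.
    boolean matches(Map<String, List<String>> metadata) {
        return metadata.entrySet().stream()
                .filter(e -> fieldMatcher.matcher(e.getKey()).matches())
                .flatMap(e -> e.getValue().stream())
                .anyMatch(v -> valueMatcher.matcher(v).matches());
    }
}

final class RestrictionCheck {
    private RestrictionCheck() { }

    // Assumption: an empty restriction list means "apply to all documents".
    static boolean isApplicable(
            List<Restriction> restrictions,
            Map<String, List<String>> metadata) {
        if (restrictions.isEmpty()) {
            return true;
        }
        // Multiple restrictions are OR-ed: only one needs to match.
        return restrictions.stream().anyMatch(r -> r.matches(metadata));
    }
}
```

With two restrictions on a content-type field, one for "application/pdf" and one for "text/.*", a document whose content type is "text/html" is accepted because the second restriction matches.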
- Since:
2.0.0
- Author:
Pascal Essiembre
Constructor Summary
Constructor                 Description
AbstractDocumentTagger()
Method Summary
Modifier and Type          Method and Description
protected abstract void    tagApplicableDocument(HandlerDoc doc, InputStream input, ParseState parseState)
void                       tagDocument(HandlerDoc doc, InputStream input, ParseState parseState)
                           Tags a document with extra metadata information.
Methods inherited from class com.norconex.importer.handler.AbstractImporterHandler
addRestriction, addRestriction, addRestrictions, clearRestrictions, detectCharsetIfBlank, equals, getRestrictions, hashCode, isApplicable, loadFromXML, loadHandlerFromXML, removeRestriction, removeRestriction, saveHandlerToXML, saveToXML, toString
Method Detail
tagDocument
public final void tagDocument(HandlerDoc doc, InputStream input, ParseState parseState) throws ImporterHandlerException
Description copied from interface: IDocumentTagger
Tags a document with extra metadata information.
- Specified by:
tagDocument in interface IDocumentTagger
- Parameters:
doc - document
input - document content
parseState - whether the document has been parsed already or not (a parsed document should normally be text-based)
- Throws:
ImporterHandlerException - problem tagging the document
tagApplicableDocument
protected abstract void tagApplicableDocument(HandlerDoc doc, InputStream input, ParseState parseState) throws ImporterHandlerException
- Throws:
ImporterHandlerException
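The pairing of a final tagDocument with an abstract tagApplicableDocument suggests a template-method design: the base class decides whether a document qualifies, and subclasses implement only the tagging. The sketch below illustrates that pattern with stand-in types. SketchDoc, SketchTagger, and LengthSketchTagger are hypothetical and are not the Norconex classes; the ParseState argument is omitted, and the skip-when-not-applicable flow is an assumption inferred from the method names and the inherited isApplicable, not something this page states.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for HandlerDoc: holds multi-valued metadata only.
final class SketchDoc {
    final Map<String, List<String>> metadata = new HashMap<>();

    void add(String field, String value) {
        metadata.computeIfAbsent(field, k -> new ArrayList<>()).add(value);
    }
}

// Stand-in for the base class: final entry point plus abstract hook.
abstract class SketchTagger {
    // Assumed flow: documents failing the applicability check are skipped.
    public final void tagDocument(SketchDoc doc, InputStream input)
            throws IOException {
        if (!isApplicable(doc)) {
            return;
        }
        tagApplicableDocument(doc, input);
    }

    // Stand-in for the inherited isApplicable; accepts everything here.
    protected boolean isApplicable(SketchDoc doc) {
        return true;
    }

    // Subclasses implement the actual tagging logic.
    protected abstract void tagApplicableDocument(
            SketchDoc doc, InputStream input) throws IOException;
}

// Example subclass: stores the content length in bytes as a field.
final class LengthSketchTagger extends SketchTagger {
    @Override
    protected void tagApplicableDocument(SketchDoc doc, InputStream input)
            throws IOException {
        long count = 0;
        byte[] buffer = new byte[4096];
        int read;
        while ((read = input.read(buffer)) != -1) {
            count += read;
        }
        // "sketch.length" is an illustrative field name.
        doc.add("sketch.length", Long.toString(count));
    }
}
```

Because the entry point is final, a subclass cannot bypass the applicability check by accident; it only ever sees documents that passed it.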