public abstract class AbstractDocumentTagger extends AbstractImporterHandler implements IDocumentTagger
Base class for taggers.
Subclasses inherit this IXMLConfigurable configuration:
<restrictTo caseSensitive="[false|true]"
        field="(name of header/metadata field name to match)">
    (regular expression of value to match)
</restrictTo>
<!-- multiple "restrictTo" tags allowed (only one needs to match) -->
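The restriction semantics described in the configuration (a regular expression tested against one metadata field, optional case sensitivity, and any single match being enough) can be sketched in plain Java. This is only an illustration: `Restriction` and `RestrictionSketch` are hypothetical stand-ins, not the Norconex classes, and metadata is reduced to a single-valued `Map`. The sketch also assumes that matching is case-insensitive unless `caseSensitive="true"` is given, that the expression must match the whole field value, and that a handler with no restrictions applies to every document; none of these is stated on this page.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-in for one <restrictTo> entry; not the Norconex class.
final class Restriction {
    private final String field;
    private final Pattern pattern;

    Restriction(String field, String regex, boolean caseSensitive) {
        this.field = field;
        // Assumption: matching ignores case unless caseSensitive is true.
        this.pattern = Pattern.compile(
                regex, caseSensitive ? 0 : Pattern.CASE_INSENSITIVE);
    }

    // True if the named field is present and its whole value matches the regex.
    boolean matches(Map<String, String> metadata) {
        String value = metadata.get(field);
        return value != null && pattern.matcher(value).matches();
    }
}

final class RestrictionSketch {
    // Assumption: with no restrictions every document is applicable.
    // Otherwise a single matching restriction is sufficient.
    static boolean isApplicable(
            List<Restriction> restrictions, Map<String, String> metadata) {
        if (restrictions.isEmpty()) {
            return true;
        }
        for (Restriction r : restrictions) {
            if (r.matches(metadata)) {
                return true;
            }
        }
        return false;
    }
}
```

With two restrictions configured, a document whose metadata satisfies either one would be treated as applicable, which mirrors the "only one needs to match" note in the configuration comment.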
Constructor and Description |
---|
AbstractDocumentTagger() |
Modifier and Type | Method and Description |
---|---|
protected abstract void | tagApplicableDocument(String reference, InputStream document, ImporterMetadata metadata, boolean parsed) |
void | tagDocument(String reference, InputStream document, ImporterMetadata metadata, boolean parsed) Tags a document with extra metadata information. |
Methods inherited from class AbstractImporterHandler:
addRestriction, addRestriction, addRestrictions, clearRestrictions, detectCharsetIfBlank, equals, getRestrictions, hashCode, isApplicable, loadFromXML, loadHandlerFromXML, removeRestriction, removeRestriction, saveHandlerToXML, saveToXML, toString
public final void tagDocument(String reference, InputStream document, ImporterMetadata metadata, boolean parsed) throws ImporterHandlerException
Tags a document with extra metadata information.
Specified by:
tagDocument in interface IDocumentTagger
Parameters:
reference - document reference (e.g. URL)
document - document
metadata - document metadata
parsed - whether the document has been parsed already or not (a parsed document should normally be text-based)
Throws:
ImporterHandlerException - problem tagging the document

protected abstract void tagApplicableDocument(String reference, InputStream document, ImporterMetadata metadata, boolean parsed) throws ImporterHandlerException
Throws:
ImporterHandlerException
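The pairing of a final tagDocument with an abstract tagApplicableDocument suggests a template-method design: the base class presumably checks whether the document is applicable and only then calls the subclass hook. The following is a minimal sketch of that pattern under stated assumptions, not the library's source. `TaggerSketch` and `PdfFlagTagger` are hypothetical names, metadata is reduced to a `Map<String, String>` in place of ImporterMetadata, the applicability check is a simplified abstract method rather than the inherited restriction logic, and ImporterHandlerException is omitted.

```java
import java.io.InputStream;
import java.util.Map;

// Simplified stand-in for the base class; not the Norconex implementation.
abstract class TaggerSketch {

    // Hypothetical applicability check standing in for the inherited isApplicable.
    protected abstract boolean isApplicable(
            String reference, Map<String, String> metadata);

    // Final entry point: filter first, then delegate to the subclass hook.
    public final void tagDocument(String reference, InputStream document,
            Map<String, String> metadata, boolean parsed) {
        if (!isApplicable(reference, metadata)) {
            return; // document left untouched
        }
        tagApplicableDocument(reference, document, metadata, parsed);
    }

    // Subclasses add their metadata here; called only for applicable documents.
    protected abstract void tagApplicableDocument(String reference,
            InputStream document, Map<String, String> metadata, boolean parsed);
}

// Hypothetical subclass that flags documents whose reference ends in ".pdf".
final class PdfFlagTagger extends TaggerSketch {

    @Override
    protected boolean isApplicable(
            String reference, Map<String, String> metadata) {
        return reference.endsWith(".pdf");
    }

    @Override
    protected void tagApplicableDocument(String reference,
            InputStream document, Map<String, String> metadata, boolean parsed) {
        metadata.put("is.pdf", "true");
    }
}
```

Because the entry point is final in this sketch, a subclass cannot bypass the applicability check; it only decides what to add once a document has passed it.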
Copyright © 2009–2021 Norconex Inc. All rights reserved.