Package | Description |
---|---|
com.norconex.importer.handler.tagger | |
com.norconex.importer.handler.tagger.impl | |
Modifier and Type | Class and Description |
---|---|
class | AbstractStringTagger: Base class to facilitate creating taggers based on text content, loading text into a StringBuilder for in-memory processing. |
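
The pattern described for AbstractStringTagger (load the text into a StringBuilder, then let a subclass process it in memory) can be sketched in plain Java as follows. This is only an illustration and does not use the Norconex Importer API: the class `StringTaggerSketch`, its `tag` and `tagStringContent` signatures, the `LengthTagger` subclass, the `content.length` field name, and the `Map<String, String>` standing in for document metadata are all assumptions made for this example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Map;

// Hypothetical stand-alone sketch; NOT the Norconex Importer API.
public abstract class StringTaggerSketch {

    // Reads the whole input into a StringBuilder, then hands it to the subclass.
    public final void tag(Reader input, Map<String, String> metadata) {
        StringBuilder content = new StringBuilder();
        char[] buffer = new char[4096];
        try {
            int read;
            while ((read = input.read(buffer)) != -1) {
                content.append(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        tagStringContent(content, metadata);
    }

    // Subclasses add metadata fields based on the in-memory text.
    protected abstract void tagStringContent(
            StringBuilder content, Map<String, String> metadata);

    // Example subclass: records the number of characters in the content.
    public static class LengthTagger extends StringTaggerSketch {
        @Override
        protected void tagStringContent(
                StringBuilder content, Map<String, String> metadata) {
            // "content.length" is a made-up field name for this sketch.
            metadata.put("content.length", Integer.toString(content.length()));
        }
    }
}
```

Loading everything into memory keeps subclasses simple, at the cost of holding the full text at once; how the real base class bounds or sections large content is not covered by this sketch.
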
Modifier and Type | Class and Description |
---|---|
class | CountMatchesTagger: Counts the number of matches of a given string (or string pattern) and stores the resulting value in the field specified by "toField". |
class | LanguageTagger: Detects a document's language using the Apache Tika language detection capability. |
class | RegexTagger: Extracts field names and their values with a regular expression. |
class | ScriptTagger: Tags incoming documents using a scripting language. |
class | SplitTagger: Splits an existing metadata value into multiple values based on a given value separator (the separator gets discarded). |
class | TextBetweenTagger: Extracts values found between matching start and end strings and adds them to a document metadata field. |
class | TextPatternTagger: Deprecated. Since 3.0.0, use RegexTagger instead. |
class | TextStatisticsTagger: Analyzes the content of the supplied document and adds statistical information about its content or field as metadata fields. |
class | TitleGeneratorTagger: Attempts to generate a title from the document content (default) or a specified metadata field. |
class | URLExtractorTagger: Extracts unique URLs matching specific patterns in plain text content and stores them in a given field. |
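
To make three of the behaviors listed above more concrete (counting matches into a "toField", splitting a value on a separator, and extracting text between start and end strings), here is a minimal plain-Java sketch. It does not use the Norconex Importer API and says nothing about how the real classes are configured: the class `TaggerSketch`, its method names and parameters, and the `Map<String, List<String>>` standing in for document metadata are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch; NOT the Norconex Importer API.
public class TaggerSketch {

    // Stand-in for document metadata: field name -> list of values.
    public static Map<String, List<String>> newMetadata() {
        return new HashMap<>();
    }

    // Counts regex matches in the text and stores the count in toField.
    public static void countMatches(String text, String regex,
            Map<String, List<String>> metadata, String toField) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        List<String> values = new ArrayList<>();
        values.add(Integer.toString(count));
        metadata.put(toField, values);
    }

    // Splits every value of a field on a literal separator,
    // replacing the field's values with the parts (separator discarded).
    public static void split(Map<String, List<String>> metadata,
            String field, String separator) {
        List<String> original = metadata.get(field);
        if (original == null) {
            return;
        }
        List<String> parts = new ArrayList<>();
        for (String value : original) {
            for (String part : value.split(Pattern.quote(separator))) {
                parts.add(part);
            }
        }
        metadata.put(field, parts);
    }

    // Adds every substring found between start and end to toField.
    public static void textBetween(String text, String start, String end,
            Map<String, List<String>> metadata, String toField) {
        List<String> found =
                metadata.computeIfAbsent(toField, k -> new ArrayList<>());
        int from = 0;
        while (true) {
            int s = text.indexOf(start, from);
            if (s < 0) {
                break;
            }
            int valueStart = s + start.length();
            int e = text.indexOf(end, valueStart);
            if (e < 0) {
                break;
            }
            found.add(text.substring(valueStart, e));
            from = e + end.length();
        }
    }
}
```

For instance, under these assumptions, calling `countMatches("apple pie, apple tart", "apple", metadata, "appleCount")` leaves the single value `"2"` in the `appleCount` field.
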
Copyright © 2009–2023 Norconex Inc. All rights reserved.