com.norconex.importer.handler.tagger.impl (Norconex Importer 3.1.0 API)

Class Summary
Class	Description
CharacterCaseTagger	Changes the character case of matching fields and values according to one of several supported methods.
CharsetTagger	Converts one or more field values (if needed) from a source character encoding (charset) to a target one.
ConstantTagger	Defines and adds constant values to documents.
CopyTagger	Copies metadata fields.
CountMatchesTagger	Counts the number of matches of a given string (or string pattern) and stores the resulting value in the specified `toField`.
CountMatchesTagger.MatchDetails	Deprecated.
CurrentDateTagger	Adds the current computer UTC date to the specified `field`.
DateFormatTagger	Formats a date from any given format to a format of choice, as per the formatting options supported by `SimpleDateFormat`, with the exception of the string "EPOCH", which represents the difference, measured in milliseconds, between the date and midnight, January 1, 1970.
DebugTagger	A utility tagger to help with troubleshooting of document importing.
DeleteTagger	Deletes the metadata fields provided.
DocumentLengthTagger	Adds the document length (i.e., number of bytes) to the specified `field`.
DOMTagger	Extracts the value of one or more elements or attributes into a target field, or deletes matching elements.
DOMTagger.DOMExtractDetails	DOM Extraction Details
ExternalTagger	Extracts metadata from a document using an external application.
FieldReportTagger	A utility tagger that reports in a CSV file the fields discovered in a crawl session, captured at the point of your choice in the importing process.
ForceSingleValueTagger	Forces a metadata field to be single-value.
HierarchyTagger	Given a separator, splits a field string into multiple segments representing each node of a hierarchical branch.
HierarchyTagger.HierarchyDetails
KeepOnlyTagger	Keeps only the metadata fields provided and deletes all others.
LanguageTagger	Detects a document language based on Apache Tika language detection capability.
MergeTagger	Merges multiple metadata fields into a single one.
MergeTagger.Merge
RegexTagger	Extracts field names and their values using regular expressions.
RenameTagger	Renames metadata fields.
RenameTagger.RenameDetails
ReplaceTagger	Replaces an existing metadata value with another one.
ReplaceTagger.Replacement
ScriptTagger	Tags incoming documents using a scripting language.
SplitTagger	Splits an existing metadata value into multiple values based on a given value separator (the separator gets discarded).
SplitTagger.SplitDetails
TextBetweenTagger	Extracts values found between matching start and end strings and adds them to a document metadata field.
TextBetweenTagger.TextBetweenDetails
TextPatternTagger	Deprecated. Since 3.0.0, use `RegexTagger`.
TextStatisticsTagger	Analyzes the content of the supplied document and adds statistical information about its content or field as metadata fields.
TitleGeneratorTagger	Attempts to generate a title from the document content (default) or a specified metadata field.
TruncateTagger	Truncates the `fromField` value(s) and optionally replaces the truncated portion with a hash value to help ensure uniqueness (not 100% guaranteed to be collision-free).
URLExtractorTagger	Extracts unique URLs matching specific patterns in plain text content and stores them in a given field.
UUIDTagger	Generates a random universally unique identifier (UUID) and stores it in the specified `field`.
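
Two of the behaviors summarized above, the "EPOCH" date format of DateFormatTagger and the branch splitting of HierarchyTagger, can be illustrated with the plain JDK. The sketch below is not the library's implementation and calls no Norconex classes: the class name `TaggerSketch` and the methods `reformatDate` and `hierarchy` are hypothetical, the UTC time zone and English locale are assumptions, and treating each hierarchy node as a path that includes its ancestors is one reading of the summary.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Locale;
import java.util.TimeZone;
import java.util.regex.Pattern;

/** Plain-JDK illustration; none of these names come from the Norconex API. */
class TaggerSketch {

    /** Special format name mentioned in the DateFormatTagger summary. */
    static final String EPOCH = "EPOCH";

    /**
     * Re-formats a date string. Either format may be "EPOCH", meaning
     * milliseconds since midnight, January 1, 1970 (UTC assumed here).
     */
    public static String reformatDate(String value, String fromFormat, String toFormat) {
        try {
            Date date = EPOCH.equals(fromFormat)
                    ? new Date(Long.parseLong(value))
                    : newFormat(fromFormat).parse(value);
            return EPOCH.equals(toFormat)
                    ? Long.toString(date.getTime())
                    : newFormat(toFormat).format(date);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparsable date: " + value, e);
        }
    }

    private static SimpleDateFormat newFormat(String pattern) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.ENGLISH);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: dates are UTC
        return fmt;
    }

    /**
     * Splits a path-like value on a literal separator and returns one entry
     * per hierarchy node, each entry including its ancestors.
     */
    public static List<String> hierarchy(String value, String separator) {
        List<String> nodes = new ArrayList<>();
        StringBuilder branch = new StringBuilder();
        boolean leading = value.startsWith(separator);
        for (String segment : value.split(Pattern.quote(separator))) {
            if (segment.isEmpty()) {
                continue; // skip empty segments from leading or doubled separators
            }
            if (branch.length() > 0 || leading) {
                branch.append(separator);
            }
            branch.append(segment);
            nodes.add(branch.toString());
        }
        return nodes;
    }

    public static void main(String[] args) {
        System.out.println(reformatDate("2001-09-09 01:46:40", "yyyy-MM-dd HH:mm:ss", EPOCH));
        System.out.println(reformatDate("0", EPOCH, "yyyy-MM-dd"));
        System.out.println(hierarchy("/vegetable/potato/sweet", "/"));
    }
}
```

Running `main` prints `1000000000000`, then `1970-01-01`, then `[/vegetable, /vegetable/potato, /vegetable/potato/sweet]`. The actual taggers are configured per field and operate on document metadata; consult each class page for the supported options.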

Enum Summary
Enum	Description
ConstantTagger.OnConflict	Deprecated.
