public class ForceSingleValueTagger extends AbstractDocumentTagger
Forces a metadata field to be single-value. Can be used as either a pre-parse or post-parse handler. The action can be one of the following:
keepFirst - Keeps the first occurrence found.
keepLast - Keeps the last occurrence found.
mergeWith:<sep> - Merges all occurrences, joining them with the specified separator (<sep>).
If you do not specify any action, the default behavior is to merge all occurrences, joining values with a comma.
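The effect of each action on a multi-valued field can be illustrated with a small standalone sketch. It does not use the Norconex API: the class `ForceSingleValueSketch` and its `reduce` helper are hypothetical names introduced here only for illustration, and the handling of an unrecognized action string (falling back to a comma merge) is an assumption, not documented behavior.

```java
import java.util.Arrays;
import java.util.List;

public class ForceSingleValueSketch {

    // Hypothetical helper (not part of the Norconex API): reduces a list of
    // field values to a single value according to an action string.
    static String reduce(List<String> values, String action) {
        if (values.isEmpty()) {
            return null;
        }
        if ("keepFirst".equals(action)) {
            return values.get(0);
        }
        if ("keepLast".equals(action)) {
            return values.get(values.size() - 1);
        }
        // No action given: merge with a comma, as the description states.
        // Assumption: unrecognized actions are treated the same way here.
        String separator = ",";
        String prefix = "mergeWith:";
        if (action != null && action.startsWith(prefix)) {
            separator = action.substring(prefix.length());
        }
        return String.join(separator, values);
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList("First", "Second", "Third");
        System.out.println(reduce(titles, "keepFirst"));     // First
        System.out.println(reduce(titles, "keepLast"));      // Third
        System.out.println(reduce(titles, "mergeWith: | ")); // First | Second | Third
        System.out.println(reduce(titles, null));            // First,Second,Third
    }
}
```

In this sketch the separator is taken verbatim from everything after `mergeWith:`, so surrounding spaces become part of the joined value.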
<handler
    class="com.norconex.importer.handler.tagger.impl.ForceSingleValueTagger"
    action="[keepFirst|keepLast|mergeWith:separator]">
  <!-- multiple "restrictTo" tags allowed (only one needs to match) -->
  <restrictTo>
    <fieldMatcher
        method="[basic|csv|wildcard|regex]"
        ignoreCase="[false|true]"
        ignoreDiacritic="[false|true]"
        partial="[false|true]">
      (field-matching expression)
    </fieldMatcher>
    <valueMatcher
        method="[basic|csv|wildcard|regex]"
        ignoreCase="[false|true]"
        ignoreDiacritic="[false|true]"
        partial="[false|true]">
      (value-matching expression)
    </valueMatcher>
  </restrictTo>
  <fieldMatcher
      method="[basic|csv|wildcard|regex]"
      ignoreCase="[false|true]"
      ignoreDiacritic="[false|true]"
      partial="[false|true]">
    (one or more matching fields to force having a single value)
  </fieldMatcher>
</handler>
<handler
    class="ForceSingleValueTagger"
    action="keepFirst">
  <fieldMatcher>title</fieldMatcher>
</handler>
For documents where multiple title fields are found, the above only keeps the first title value captured.
Constructor and Description |
---|
ForceSingleValueTagger() |
Modifier and Type | Method and Description |
---|---|
void | addSingleValueField(String field, String action) Deprecated. Since 3.0.0, use setFieldMatcher(TextMatcher) and setAction(String). |
boolean | equals(Object other) |
String | getAction() Gets action. |
TextMatcher | getFieldMatcher() Gets field matcher. |
Map<String,String> | getSingleValueFields() Deprecated. Since 3.0.0, use getFieldMatcher(). |
int | hashCode() |
protected void | loadHandlerFromXML(XML xml) Loads configuration settings specific to the implementing class. |
void | removeSingleValueField(String name) Deprecated. Since 3.0.0, use setFieldMatcher(TextMatcher). |
protected void | saveHandlerToXML(XML xml) Saves configuration settings specific to the implementing class. |
void | setAction(String action) Sets the action. |
void | setFieldMatcher(TextMatcher fieldMatcher) Sets field matcher. |
void | tagApplicableDocument(HandlerDoc doc, InputStream document, ParseState parseState) |
String | toString() |
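Methods inherited from class AbstractDocumentTagger: tagDocument
Methods inherited from class AbstractImporterHandler: addRestriction, addRestriction, addRestrictions, clearRestrictions, detectCharsetIfBlank, getRestrictions, isApplicable, loadFromXML, removeRestriction, removeRestriction, saveToXML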
public void tagApplicableDocument(HandlerDoc doc, InputStream document, ParseState parseState) throws ImporterHandlerException
Specified by: tagApplicableDocument in class AbstractDocumentTagger
Throws: ImporterHandlerException

public TextMatcher getFieldMatcher()

public void setFieldMatcher(TextMatcher fieldMatcher)
Parameters: fieldMatcher - field matcher

public String getAction()

public void setAction(String action)
Parameters: action - action to be performed

@Deprecated public Map<String,String> getSingleValueFields()
Deprecated. Since 3.0.0, use getFieldMatcher().

@Deprecated public void addSingleValueField(String field, String action)
Deprecated. Since 3.0.0, use setFieldMatcher(TextMatcher) and setAction(String).
Parameters: field - field
action - action

@Deprecated public void removeSingleValueField(String name)
Deprecated. Since 3.0.0, use setFieldMatcher(TextMatcher).
Parameters: name - field name

protected void loadHandlerFromXML(XML xml)
Loads configuration settings specific to the implementing class. (Description copied from class AbstractImporterHandler.)
Specified by: loadHandlerFromXML in class AbstractImporterHandler
Parameters: xml - XML configuration

protected void saveHandlerToXML(XML xml)
Saves configuration settings specific to the implementing class. (Description copied from class AbstractImporterHandler.)
Specified by: saveHandlerToXML in class AbstractImporterHandler
Parameters: xml - the XML

public boolean equals(Object other)
Overrides: equals in class AbstractImporterHandler

public int hashCode()
Overrides: hashCode in class AbstractImporterHandler

public String toString()
Overrides: toString in class AbstractImporterHandler
Copyright © 2009–2023 Norconex Inc. All rights reserved.