Class CharacterCaseTagger
- java.lang.Object
  - com.norconex.importer.handler.AbstractImporterHandler
    - com.norconex.importer.handler.tagger.AbstractDocumentTagger
      - com.norconex.importer.handler.tagger.impl.CharacterCaseTagger
- All Implemented Interfaces:
  IXMLConfigurable, IImporterHandler, IDocumentTagger
public class CharacterCaseTagger extends AbstractDocumentTagger
Changes the character case of matching fields and values according to one of the following methods:
- upper: Changes all characters to upper case.
- lower: Changes all characters to lower case.
- words: Converts the first letter of each word to upper case, and leaves the character case of other characters unchanged.
- wordsFully: Converts the first letter of each word to upper case, and the rest to lower case.
- sentences: Converts the first letter of each sentence to upper case, and leaves the character case of other characters unchanged.
- sentencesFully: Converts the first letter of each sentence to upper case, and converts other characters to lower case.
- string: Converts the first letter of a string to upper case, and leaves the character case of other characters unchanged.
- stringFully: Converts the first letter of a string to upper case, and converts other characters to lower case.
- swap: Converts all upper case characters to lower case, and all lower case to upper case.
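The differences between the plain and "Fully" variants, and the effect of "swap", are easiest to see on a mixed-case input. The following is a minimal sketch using only the Java standard library. It is not the CharacterCaseTagger source: the class name CaseTypeSketch and its methods are hypothetical, and words are assumed to be separated by whitespace, which may differ from how the library detects word boundaries.

```java
import java.util.Locale;

// Illustrative only: NOT the CharacterCaseTagger implementation.
// Assumption: words are delimited by whitespace.
final class CaseTypeSketch {

    // "words" when fully is false, "wordsFully" when fully is true.
    static String words(String s, boolean fully) {
        StringBuilder out = new StringBuilder(s.length());
        boolean startOfWord = true;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isWhitespace(c)) {
                startOfWord = true;
                out.append(c);
            } else if (startOfWord) {
                out.append(Character.toUpperCase(c));
                startOfWord = false;
            } else {
                // Only the "Fully" variant touches the remaining characters.
                out.append(fully ? Character.toLowerCase(c) : c);
            }
        }
        return out.toString();
    }

    // "string" when fully is false, "stringFully" when fully is true.
    static String string(String s, boolean fully) {
        if (s.isEmpty()) {
            return s;
        }
        String rest = s.substring(1);
        return Character.toUpperCase(s.charAt(0))
                + (fully ? rest.toLowerCase(Locale.ROOT) : rest);
    }

    // "swap": upper becomes lower, lower becomes upper, the rest is kept.
    static String swap(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isUpperCase(c)) {
                out.append(Character.toLowerCase(c));
            } else if (Character.isLowerCase(c)) {
                out.append(Character.toUpperCase(c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String in = "hELLO wORLD";
        System.out.println(words(in, false));  // HELLO WORLD
        System.out.println(words(in, true));   // Hello World
        System.out.println(string(in, false)); // HELLO wORLD
        System.out.println(string(in, true));  // Hello world
        System.out.println(swap(in));          // Hello World
    }
}
```

The "sentences" variants are left out because they depend on how sentence boundaries are detected, which this page does not specify.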
The change of character case can be applied to one of the following (defaults to "value" when unspecified):
- value: Applies to the field values.
- field: Applies to the field name.
- both: Applies to both the field name and its values.
Field names are referenced in a case insensitive manner.
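To make the "value", "field", and "both" targets concrete, the sketch below applies one case change to a plain map that stands in for document metadata. Everything here is an assumption made for illustration: the class ApplyToSketch, the helper applyCase, and the use of a Map are hypothetical and do not reflect how the library stores or updates fields. The field name is compared case-insensitively, as described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

// Illustrative only: a Map stands in for document metadata, and
// applyCase(...) is a hypothetical helper, not a library method.
final class ApplyToSketch {

    static Map<String, List<String>> applyCase(
            Map<String, List<String>> metadata, String fieldName,
            String applyTo, UnaryOperator<String> change) {
        boolean toField = applyTo.equals("field") || applyTo.equals("both");
        boolean toValue = applyTo.equals("value") || applyTo.equals("both");
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : metadata.entrySet()) {
            String name = e.getKey();
            List<String> values = new ArrayList<>(e.getValue());
            // Field names are matched without regard to case.
            if (name.equalsIgnoreCase(fieldName)) {
                if (toValue) {
                    values.replaceAll(change);
                }
                if (toField) {
                    name = change.apply(name);
                }
            }
            result.put(name, values);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> meta = new LinkedHashMap<>();
        meta.put("Title", Arrays.asList("Annual Report"));
        UnaryOperator<String> upper = s -> s.toUpperCase(Locale.ROOT);

        System.out.println(applyCase(meta, "title", "value", upper)); // {Title=[ANNUAL REPORT]}
        System.out.println(applyCase(meta, "title", "field", upper)); // {TITLE=[Annual Report]}
        System.out.println(applyCase(meta, "title", "both", upper));  // {TITLE=[ANNUAL REPORT]}
    }
}
```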
XML configuration usage:
<handler class="com.norconex.importer.handler.tagger.impl.CharacterCaseTagger"
    type="[upper|lower|words|wordsFully|sentences|sentencesFully|string|stringFully|swap]"
    applyTo="[value|field|both]">
  <!-- multiple "restrictTo" tags allowed (only one needs to match) -->
  <restrictTo>
    <fieldMatcher>(field-matching expression)</fieldMatcher>
    <valueMatcher>(value-matching expression)</valueMatcher>
  </restrictTo>
  <fieldMatcher>(expression to narrow by matching fields)</fieldMatcher>
</handler>
XML usage example:
<!-- Converts title to lowercase -->
<handler class="CharacterCaseTagger" type="lower" applyTo="field">
  <fieldMatcher>title</fieldMatcher>
</handler>
<!-- Make first title character uppercase -->
<handler class="CharacterCaseTagger" type="string" applyTo="value">
  <fieldMatcher>title</fieldMatcher>
</handler>
Applied in sequence, the handlers above convert a title to lower case and then change its first character to upper case.
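The combined effect of a "lower" step followed by a "string" step can be traced with plain string operations. This is a rough sketch under the assumption that both steps act on the same title value. The class TitleChainSketch and its method are hypothetical and do not use the library.

```java
import java.util.Locale;

// Illustrative only: mimics a "lower" step followed by a "string" step
// on a single title value, without using the library.
final class TitleChainSketch {

    static String lowerThenString(String title) {
        // Step 1, like type="lower": every character to lower case.
        String lowered = title.toLowerCase(Locale.ROOT);
        if (lowered.isEmpty()) {
            return lowered;
        }
        // Step 2, like type="string": first character to upper case,
        // the rest left as step 1 produced it.
        return Character.toUpperCase(lowered.charAt(0)) + lowered.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(lowerThenString("tHE QUICK Brown fOX")); // The quick brown fox
    }
}
```

A single "stringFully" step would give the same result for this input, since it also capitalizes the first character and lower-cases the rest.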
- Since:
- 2.0.0
- Author:
- Pascal Essiembre
-
-
Field Summary
Fields
Modifier and Type    Field
static String        APPLY_BOTH
static String        APPLY_FIELD
static String        APPLY_VALUE
static String        CASE_LOWER
static String        CASE_SENTENCES
static String        CASE_SENTENCES_FULLY
static String        CASE_STRING
static String        CASE_STRING_FULLY
static String        CASE_SWAP
static String        CASE_UPPER
static String        CASE_WORDS
static String        CASE_WORDS_FULLY
-
Constructor Summary
Constructors
Constructor
CharacterCaseTagger()
-
Method Summary
All Methods, Instance Methods, Concrete Methods, Deprecated Methods
Modifier and Type, Method, Description
void addFieldCase(String field, String caseType)
    Deprecated. Since 3.0.0, use setFieldMatcher(TextMatcher), setCaseType(String).
void addFieldCase(String field, String caseType, String applyTo)
    Deprecated.
boolean equals(Object other)
String getApplyTo()
    Gets whether to apply the case transformation to fields, values, or both.
String getApplyTo(String fieldName)
    Deprecated. Since 3.0.0 use getApplyTo()
String getCaseType()
    Gets the type of character case transformation.
String getCaseType(String fieldName)
    Deprecated. Since 3.0.0 use getCaseType()
TextMatcher getFieldMatcher()
    Gets field matcher.
Set<String> getFieldNames()
    Deprecated. Since 3.0.0 use getFieldMatcher()
int hashCode()
protected void loadHandlerFromXML(XML xml)
    Loads configuration settings specific to the implementing class.
protected void saveHandlerToXML(XML xml)
    Saves configuration settings specific to the implementing class.
void setApplyTo(String applyTo)
    Sets whether to apply the case transformation to fields, values, or both.
void setCaseType(String caseType)
    Sets the type of character case transformation.
void setFieldMatcher(TextMatcher fieldMatcher)
    Sets field matcher.
void tagApplicableDocument(HandlerDoc doc, InputStream document, ParseState parseState)
String toString()
-
Methods inherited from class com.norconex.importer.handler.tagger.AbstractDocumentTagger
tagDocument
-
Methods inherited from class com.norconex.importer.handler.AbstractImporterHandler
addRestriction, addRestriction, addRestrictions, clearRestrictions, detectCharsetIfBlank, getRestrictions, isApplicable, loadFromXML, removeRestriction, removeRestriction, saveToXML
-
-
-
-
Field Detail
-
CASE_WORDS
public static final String CASE_WORDS
- See Also:
- Constant Field Values
-
CASE_WORDS_FULLY
public static final String CASE_WORDS_FULLY
- See Also:
- Constant Field Values
-
CASE_UPPER
public static final String CASE_UPPER
- See Also:
- Constant Field Values
-
CASE_LOWER
public static final String CASE_LOWER
- See Also:
- Constant Field Values
-
CASE_SWAP
public static final String CASE_SWAP
- See Also:
- Constant Field Values
-
CASE_SENTENCES
public static final String CASE_SENTENCES
- See Also:
- Constant Field Values
-
CASE_SENTENCES_FULLY
public static final String CASE_SENTENCES_FULLY
- See Also:
- Constant Field Values
-
CASE_STRING
public static final String CASE_STRING
- See Also:
- Constant Field Values
-
CASE_STRING_FULLY
public static final String CASE_STRING_FULLY
- See Also:
- Constant Field Values
-
APPLY_VALUE
public static final String APPLY_VALUE
- See Also:
- Constant Field Values
-
APPLY_FIELD
public static final String APPLY_FIELD
- See Also:
- Constant Field Values
-
APPLY_BOTH
public static final String APPLY_BOTH
- See Also:
- Constant Field Values
-
-
Method Detail
-
getFieldMatcher
public TextMatcher getFieldMatcher()
Gets field matcher.
- Returns:
- field matcher
- Since:
- 3.0.0
-
setFieldMatcher
public void setFieldMatcher(TextMatcher fieldMatcher)
Sets field matcher.
- Parameters:
fieldMatcher - field matcher
- Since:
- 3.0.0
-
getCaseType
public String getCaseType()
Gets the type of character case transformation.
- Returns:
- type of case transformation
- Since:
- 3.0.0
-
setCaseType
public void setCaseType(String caseType)
Sets the type of character case transformation.
- Parameters:
caseType - type of case transformation
- Since:
- 3.0.0
-
getApplyTo
public String getApplyTo()
Gets whether to apply the case transformation to fields, values, or both.
- Returns:
- one of "field", "value", or "both"
- Since:
- 3.0.0
-
setApplyTo
public void setApplyTo(String applyTo)
Sets whether to apply the case transformation to fields, values, or both.
- Parameters:
applyTo - one of "field", "value", or "both"
- Since:
- 3.0.0
-
tagApplicableDocument
public void tagApplicableDocument(HandlerDoc doc, InputStream document, ParseState parseState) throws ImporterHandlerException
- Specified by:
tagApplicableDocument in class AbstractDocumentTagger
- Throws:
ImporterHandlerException
-
addFieldCase
@Deprecated public void addFieldCase(String field, String caseType)
Deprecated. Since 3.0.0, use setFieldMatcher(TextMatcher), setCaseType(String).
Adds field case changing instructions.
- Parameters:
field - the field to apply the case changing
caseType - the type of case change to apply
-
addFieldCase
@Deprecated public void addFieldCase(String field, String caseType, String applyTo)
Deprecated.
Adds field case changing instructions.
- Parameters:
field - the field to apply the case changing
caseType - the type of case change to apply
applyTo - what to apply the case change to
- Since:
- 2.4.0
-
getFieldNames
@Deprecated public Set<String> getFieldNames()
Deprecated. Since 3.0.0 use getFieldMatcher()
Gets the field matcher pattern.
- Returns:
- field matcher pattern
-
getCaseType
@Deprecated public String getCaseType(String fieldName)
Deprecated. Since 3.0.0 use getCaseType()
Gets the case conversion type.
- Parameters:
fieldName - field name
- Returns:
- case type
-
getApplyTo
@Deprecated public String getApplyTo(String fieldName)
Deprecated. Since 3.0.0 use getApplyTo()
Gets whether the case change applies to fields, values, or both.
- Parameters:
fieldName - the field name
- Returns:
- "field", "value", or "both"
- Since:
- 2.4.0
-
loadHandlerFromXML
protected void loadHandlerFromXML(XML xml)
Description copied from class: AbstractImporterHandler
Loads configuration settings specific to the implementing class.
- Specified by:
loadHandlerFromXML in class AbstractImporterHandler
- Parameters:
xml - XML configuration
-
saveHandlerToXML
protected void saveHandlerToXML(XML xml)
Description copied from class: AbstractImporterHandler
Saves configuration settings specific to the implementing class.
- Specified by:
saveHandlerToXML in class AbstractImporterHandler
- Parameters:
xml - the XML
-
equals
public boolean equals(Object other)
- Overrides:
equals in class AbstractImporterHandler
-
hashCode
public int hashCode()
- Overrides:
hashCode in class AbstractImporterHandler
-
toString
public String toString()
- Overrides:
toString in class AbstractImporterHandler
-
-