Class DocumentLengthTagger
- java.lang.Object
  - com.norconex.importer.handler.AbstractImporterHandler
    - com.norconex.importer.handler.tagger.AbstractDocumentTagger
      - com.norconex.importer.handler.tagger.impl.DocumentLengthTagger
- All Implemented Interfaces:
IXMLConfigurable, IImporterHandler, IDocumentTagger
public class DocumentLengthTagger extends AbstractDocumentTagger
Adds the document length (i.e., number of bytes) to the specified field. The length is that of the document content at its current processing stage. For instance, if you place this tagger after a transformer that modifies the content, the length obtained is that of the modified content, not the original. To obtain a document's length before any modification is made to it, use this tagger as one of the first handlers in your pre-parse handlers.
Storing values in an existing field
If a target field with the same name already exists for a document, values are added to the end of the existing value list. You can change this default behavior by supplying a PropertySetter.
Can be used as either a pre-parse or a post-parse handler.
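The difference between the default behavior (adding to the end of the existing value list) and an overwriting behavior can be sketched in plain Java. This is only an illustration under stated assumptions: the `FieldSetSketch` class, its `Mode` enum, and the `set` helper are hypothetical stand-ins and are not part of the Norconex API. The actual behavior is selected through `setOnSet(PropertySetter)`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the Norconex implementation.
final class FieldSetSketch {

    /** Hypothetical stand-in for the two behaviors described above. */
    enum Mode { APPEND, REPLACE }

    /** Stores a value in a multi-valued field according to the given mode. */
    static void set(Map<String, List<String>> metadata,
            String field, String value, Mode mode) {
        List<String> values =
                metadata.computeIfAbsent(field, k -> new ArrayList<>());
        if (mode == Mode.REPLACE) {
            // Discard whatever the field already holds.
            values.clear();
        }
        // In both modes the new value ends up last in the list.
        values.add(value);
    }

    public static void main(String[] args) {
        Map<String, List<String>> metadata = new HashMap<>();

        // Default-like behavior: the second length is appended to the first.
        set(metadata, "docSize", "1024", Mode.APPEND);
        set(metadata, "docSize", "512", Mode.APPEND);
        System.out.println(metadata.get("docSize")); // [1024, 512]

        // Overwriting behavior: only the latest length is kept.
        set(metadata, "docSize", "512", Mode.REPLACE);
        System.out.println(metadata.get("docSize")); // [512]
    }
}
```

If the tagger runs more than once on the same document (for example, once pre-parse and once post-parse) with the same target field, the append behavior leaves both lengths in the field, which may or may not be what you want.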
XML configuration usage:
<handler class="com.norconex.importer.handler.tagger.impl.DocumentLengthTagger"
    toField="(mandatory target field)">
  <!-- multiple "restrictTo" tags allowed (only one needs to match) -->
  <restrictTo>
    <fieldMatcher>(field-matching expression)</fieldMatcher>
    <valueMatcher>(value-matching expression)</valueMatcher>
  </restrictTo>
</handler>
XML usage example:
<handler class="DocumentLengthTagger" toField="docSize"/>
The example above stores the document length in a "docSize" field.
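As the class description notes, the stored length is a byte count of the content at its current processing stage. The sketch below illustrates both points with the standard library only. It is an assumption-laden illustration, not the tagger's actual code: the `DocLengthSketch` class, the `countBytes` helper, and the tag-stripping "transformation" are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the Norconex implementation.
final class DocLengthSketch {

    /** Counts the bytes in a stream by reading it to the end. */
    static long countBytes(InputStream in) {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        try {
            while ((read = in.read(buffer)) != -1) {
                total += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    /** Convenience wrapper: byte length of a string encoded as UTF-8. */
    static long utf8Length(String content) {
        return countBytes(new ByteArrayInputStream(
                content.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) {
        // "\u00e9" is a single character that takes two bytes in UTF-8.
        String original = "<p>caf\u00e9</p>";

        // Hypothetical transformation standing in for an earlier handler.
        String transformed = original.replaceAll("<[^>]+>", "");

        System.out.println("before transformation: " + utf8Length(original));
        System.out.println("after transformation:  " + utf8Length(transformed));
        System.out.println("characters after:      " + transformed.length());
    }
}
```

The byte count and the character count differ whenever the content holds multi-byte characters, and the count measured after a content-modifying step reflects the modified content only.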
- Since:
- 2.2.0
- Author:
- Pascal Essiembre
-
-
Constructor Summary
Constructor               Description
DocumentLengthTagger()
-
Method Summary
Modifier and Type, Method, Description
boolean equals(Object other)
String getField()
    Deprecated. Since 3.0.0, use getToField().
PropertySetter getOnSet()
    Gets the property setter to use when a value is set.
String getToField()
    Gets the target field.
int hashCode()
boolean isOverwrite()
    Deprecated. Since 3.0.0, use getOnSet().
protected void loadHandlerFromXML(XML xml)
    Loads configuration settings specific to the implementing class.
protected void saveHandlerToXML(XML xml)
    Saves configuration settings specific to the implementing class.
void setField(String toField)
    Deprecated. Since 3.0.0, use setToField(String).
void setOnSet(PropertySetter onSet)
    Sets the property setter to use when a value is set.
void setOverwrite(boolean overwrite)
    Deprecated. Since 3.0.0, use setOnSet(PropertySetter).
void setToField(String toField)
    Sets the target field.
void tagApplicableDocument(HandlerDoc doc, InputStream document, ParseState parseState)
String toString()
-
Methods inherited from class com.norconex.importer.handler.tagger.AbstractDocumentTagger
tagDocument
-
Methods inherited from class com.norconex.importer.handler.AbstractImporterHandler
addRestriction, addRestriction, addRestrictions, clearRestrictions, detectCharsetIfBlank, getRestrictions, isApplicable, loadFromXML, removeRestriction, removeRestriction, saveToXML
-
Method Detail
-
tagApplicableDocument
public void tagApplicableDocument(HandlerDoc doc, InputStream document, ParseState parseState) throws ImporterHandlerException
- Specified by:
tagApplicableDocument in class AbstractDocumentTagger
- Throws:
ImporterHandlerException
-
getToField
public String getToField()
Gets the target field.
- Returns:
- target field
- Since:
- 3.0.0
-
setToField
public void setToField(String toField)
Sets the target field.
- Parameters:
toField - target field
- Since:
- 3.0.0
-
getField
@Deprecated public String getField()
Deprecated. Since 3.0.0, use getToField().
Gets the target field.
- Returns:
- target field
-
setField
@Deprecated public void setField(String toField)
Deprecated. Since 3.0.0, use setToField(String).
Sets the target field.
- Parameters:
toField - target field
-
isOverwrite
@Deprecated public boolean isOverwrite()
Deprecated. Since 3.0.0, use getOnSet().
Gets whether an existing value for the same field should be overwritten.
- Returns:
true if overwriting the existing value
-
setOverwrite
@Deprecated public void setOverwrite(boolean overwrite)
Deprecated. Since 3.0.0, use setOnSet(PropertySetter).
Sets whether an existing value for the same field should be overwritten.
- Parameters:
overwrite - true if overwriting the existing value
-
getOnSet
public PropertySetter getOnSet()
Gets the property setter to use when a value is set.
- Returns:
- property setter
- Since:
- 3.0.0
-
setOnSet
public void setOnSet(PropertySetter onSet)
Sets the property setter to use when a value is set.
- Parameters:
onSet - property setter
- Since:
- 3.0.0
-
loadHandlerFromXML
protected void loadHandlerFromXML(XML xml)
Description copied from class: AbstractImporterHandler
Loads configuration settings specific to the implementing class.
- Specified by:
loadHandlerFromXML in class AbstractImporterHandler
- Parameters:
xml - XML configuration
-
saveHandlerToXML
protected void saveHandlerToXML(XML xml)
Description copied from class: AbstractImporterHandler
Saves configuration settings specific to the implementing class.
- Specified by:
saveHandlerToXML in class AbstractImporterHandler
- Parameters:
xml - the XML
-
equals
public boolean equals(Object other)
- Overrides:
equals in class AbstractImporterHandler
-
hashCode
public int hashCode()
- Overrides:
hashCode in class AbstractImporterHandler
-
toString
public String toString()
- Overrides:
toString in class AbstractImporterHandler
-