public class GenericMetadataChecksummer extends AbstractMetadataChecksummer
Generic implementation of IMetadataChecksummer that uses specified field names and their values to create a checksum. The names and values are simply returned as is, joined using this format: fieldName=fieldValue;fieldName=fieldValue;...
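The joining format described above can be illustrated with a minimal, self-contained sketch. It does not use the Norconex classes: a plain Map stands in for the document metadata, and joinFields is a hypothetical helper, not part of the library. Sorting the field names and emitting one pair per value of a multi-valued field are assumptions made here to keep the output deterministic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class ChecksumFormatSketch {

    // Hypothetical helper (not a Norconex API): joins the selected fields
    // as "fieldName=fieldValue;" pairs.
    static String joinFields(
            Map<String, List<String>> metadata, List<String> fieldNames) {
        // Assumption: sort names so the same metadata always yields
        // the same checksum string.
        List<String> sorted = new ArrayList<>(fieldNames);
        Collections.sort(sorted);

        StringBuilder b = new StringBuilder();
        for (String name : sorted) {
            List<String> values = metadata.get(name);
            if (values == null) {
                continue; // field absent from this document
            }
            // Assumption: a multi-valued field contributes one pair per value.
            for (String value : values) {
                b.append(name).append('=').append(value).append(';');
            }
        }
        return b.toString();
    }

    public static void main(String[] args) {
        Map<String, List<String>> metadata = Map.of(
                "docLastModified", List.of("2023-01-15"),
                "docSize", List.of("2048"),
                "title", List.of("not part of the checksum"));

        // prints: docLastModified=2023-01-15;docSize=2048;
        System.out.println(
                joinFields(metadata, List.of("docSize", "docLastModified")));
    }
}
```

Because the values are used as is, any change to a selected field value changes the resulting string, which is what lets it act as a change indicator.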
You have the option to keep the checksum as a document metadata field. When keep is set to true with AbstractMetadataChecksummer.setKeep(boolean), the checksum is stored in the specified target field. If no target field is specified, it is stored under the metadata field named by CrawlDocMetadata.CHECKSUM_METADATA.
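The keep and target field rules can be sketched in plain Java as follows. This is only an illustration of the behavior described above, not the library's code: resolveTargetField and storeChecksum are hypothetical helpers, and DEFAULT_CHECKSUM_FIELD is a placeholder for the value of CrawlDocMetadata.CHECKSUM_METADATA, which this page does not show.

```java
import java.util.HashMap;
import java.util.Map;

public class KeepChecksumSketch {

    // Placeholder only: stands in for CrawlDocMetadata.CHECKSUM_METADATA,
    // whose actual value is not given in this page.
    static final String DEFAULT_CHECKSUM_FIELD = "checksum-metadata-placeholder";

    // Hypothetical helper: returns the field the checksum would be stored
    // under, or null when the checksum is not kept.
    static String resolveTargetField(boolean keep, String toField) {
        if (!keep) {
            return null; // toField is ignored unless keep is true
        }
        if (toField == null || toField.trim().isEmpty()) {
            return DEFAULT_CHECKSUM_FIELD; // no target field specified
        }
        return toField;
    }

    // Hypothetical helper: stores the checksum only when a target resolves.
    static void storeChecksum(Map<String, String> metadata,
            String checksum, boolean keep, String toField) {
        String target = resolveTargetField(keep, toField);
        if (target != null) {
            metadata.put(target, checksum);
        }
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        storeChecksum(metadata, "docSize=2048;", true, "myChecksum");
        // prints: {myChecksum=docSize=2048;}
        System.out.println(metadata);
    }
}
```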
<metadataChecksummer
    class="com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer"
    keep="[false|true]"
    toField="(optional field to store the checksum)">
  <fieldMatcher
      method="[basic|csv|wildcard|regex]"
      ignoreCase="[false|true]"
      ignoreDiacritic="[false|true]"
      partial="[false|true]">
    (expression matching fields used to create the checksum)
  </fieldMatcher>
</metadataChecksummer>
toField is ignored unless the keep attribute is set to true.
<metadataChecksummer class="GenericMetadataChecksummer">
  <fieldMatcher method="csv">
    docLastModified,docSize
  </fieldMatcher>
</metadataChecksummer>
The above example uses a combination of two (fictitious) fields called "docLastModified" and "docSize" to make the checksum.
Since 2.0.0, a self-closing <metadataChecksummer/> tag without any attributes is used to disable checksum generation.
Constructor and Description |
---|
GenericMetadataChecksummer() |
Modifier and Type | Method and Description |
---|---|
protected String | doCreateMetaChecksum(Properties metadata) |
boolean | equals(Object other) |
TextMatcher | getFieldMatcher() Gets the field matcher. |
List<String> | getSourceFields() Deprecated. Since 2.0.0, use getFieldMatcher(). |
String | getSourceFieldsRegex() Deprecated. Since 2.0.0, use getFieldMatcher(). |
int | hashCode() |
boolean | isDisabled() Deprecated. Since 2.0.0, not having a checksummer defined or setting one explicitly to null effectively disables it. |
protected void | loadChecksummerFromXML(XML xml) |
protected void | saveChecksummerToXML(XML xml) |
void | setDisabled(boolean disabled) Deprecated. Since 2.0.0, not having a checksummer defined or setting one explicitly to null effectively disables it. |
void | setFieldMatcher(TextMatcher fieldMatcher) Sets the field matcher. |
void | setSourceFields(List<String> sourceFields) Deprecated. Since 2.0.0, use setFieldMatcher(TextMatcher). |
void | setSourceFields(String... sourceFields) Deprecated. Since 2.0.0, use setFieldMatcher(TextMatcher). |
void | setSourceFieldsRegex(String sourceFieldsRegex) Deprecated. Since 2.0.0, use setFieldMatcher(TextMatcher). |
String | toString() |
Methods inherited from class AbstractMetadataChecksummer: createMetadataChecksum, getOnSet, getTargetField, getToField, isKeep, loadFromXML, saveToXML, setKeep, setOnSet, setTargetField, setToField
protected String doCreateMetaChecksum(Properties metadata)
Specified by: doCreateMetaChecksum in class AbstractMetadataChecksummer
public TextMatcher getFieldMatcher()
Gets the field matcher.
public void setFieldMatcher(TextMatcher fieldMatcher)
Sets the field matcher.
Parameters: fieldMatcher - field matcher
@Deprecated public List<String> getSourceFields()
Deprecated. Since 2.0.0, use getFieldMatcher().
@Deprecated public void setSourceFields(String... sourceFields)
Deprecated. Since 2.0.0, use setFieldMatcher(TextMatcher).
Parameters: sourceFields - fields to use for checksum
@Deprecated public void setSourceFields(List<String> sourceFields)
Deprecated. Since 2.0.0, use setFieldMatcher(TextMatcher).
Parameters: sourceFields - fields to use for checksum
@Deprecated public String getSourceFieldsRegex()
Deprecated. Since 2.0.0, use getFieldMatcher().
@Deprecated public void setSourceFieldsRegex(String sourceFieldsRegex)
Deprecated. Since 2.0.0, use setFieldMatcher(TextMatcher).
Parameters: sourceFieldsRegex - regular expression
@Deprecated public boolean isDisabled()
Deprecated. Since 2.0.0, not having a checksummer defined or setting one explicitly to null effectively disables it.
Returns: false
@Deprecated public void setDisabled(boolean disabled)
Deprecated. Since 2.0.0, not having a checksummer defined or setting one explicitly to null effectively disables it.
Parameters: disabled - argument is ignored
protected void loadChecksummerFromXML(XML xml)
Specified by: loadChecksummerFromXML in class AbstractMetadataChecksummer
protected void saveChecksummerToXML(XML xml)
Specified by: saveChecksummerToXML in class AbstractMetadataChecksummer
public boolean equals(Object other)
Overrides: equals in class AbstractMetadataChecksummer
public int hashCode()
Overrides: hashCode in class AbstractMetadataChecksummer
public String toString()
Overrides: toString in class AbstractMetadataChecksummer
Copyright © 2014–2023 Norconex Inc. All rights reserved.