public interface IMetadataChecksummer
Creates a checksum representing a document based on document metadata values obtained prior to fetching that document (e.g. HTTP header values from an HTTP HEAD call, file properties, etc.). Checksums are used to quickly filter out documents that have already been processed and to detect those that have changed since a previous run.
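As a rough illustration of how such a checksum could be used for filtering, the sketch below remembers the last checksum seen for each document reference and reports whether a document is new or has changed. The `ChecksumFilter` class and its `shouldProcess` method are hypothetical names chosen for this example; they are not part of the Norconex API, and a real collector would persist checksums between runs rather than hold them in memory.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not a Norconex class): keeps the checksum recorded
// for each document reference so later sightings can be compared against it.
final class ChecksumFilter {

    // reference (e.g. a URL or file path) -> last known metadata checksum
    private final Map<String, String> previous = new HashMap<>();

    /**
     * Records the given checksum and returns true when the document is new
     * or its checksum differs from the one previously recorded.
     */
    boolean shouldProcess(String reference, String checksum) {
        String old = previous.put(reference, checksum);
        return old == null || !old.equals(checksum);
    }
}
```

With this sketch, a document whose metadata checksum matches the stored one can be skipped without being fetched.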
Two or more Doc instances can hold different values but be deemed logically the same. Such documents do not have to be equal, but they should return the same checksum. An example of this can be two different URLs pointing to the same document, where only a single instance should be kept.
There are no strict rules that define what is equivalent or not.
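One possible way to make logically equivalent documents share a checksum is to hash only the metadata fields considered significant. The sketch below assumes `Content-Length` and `Last-Modified` are those fields, so two entries that differ only in their URL produce the same checksum. To stay self-contained it uses a stand-in interface, `SimpleMetadataChecksummer`, that takes a plain `Map<String, List<String>>` instead of the `Properties` type; the interface, the `HeaderChecksummer` class, the field choice, and the use of SHA-256 are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Map;

// Stand-in for the real interface (assumption): a plain Map replaces Properties.
interface SimpleMetadataChecksummer {
    String createMetadataChecksum(Map<String, List<String>> metadata);
}

// Hypothetical implementation that hashes only a fixed set of metadata fields.
final class HeaderChecksummer implements SimpleMetadataChecksummer {

    // Fields assumed to identify a document version; every other field is ignored.
    private static final String[] FIELDS = {"Content-Length", "Last-Modified"};

    @Override
    public String createMetadataChecksum(Map<String, List<String>> metadata) {
        // Build a canonical "name=value" line for each selected field.
        StringBuilder source = new StringBuilder();
        for (String field : FIELDS) {
            List<String> values = metadata.get(field);
            source.append(field).append('=')
                  .append(values == null ? "" : String.join(",", values))
                  .append('\n');
        }
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(source.toString().getBytes(StandardCharsets.UTF_8));
            // Hex-encode the digest so the checksum is a printable String.
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }
}
```

Because there are no strict rules for equivalence, which fields to include is a design decision: hashing fewer fields treats more documents as the same, while hashing more fields detects more changes.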
See Also: AbstractMetadataChecksummer
| Modifier and Type | Method and Description |
|---|---|
| String | createMetadataChecksum(Properties metadata): Creates a metadata checksum. |
String createMetadataChecksum(Properties metadata)
Parameters:
metadata - all metadata values

Copyright © 2014–2023 Norconex Inc. All rights reserved.