public interface IDocumentChecksummer
Creates a checksum representing a document. Checksums are used to quickly filter out documents that have already been processed or that have changed since a previous run.
Two or more Doc instances can hold different values but be deemed logically the same.
Such documents do not have to be equal, but they should return the same checksum.
An example of this is two different URLs pointing to the same document, where only a single instance should be kept.
There are no strict rules that define what is equivalent or not.
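As a rough illustration of the equivalence idea described above, the following sketch shows one possible implementation that hashes only the document content, so two references pointing to the same content yield the same checksum. The `Doc` class here is a simplified, hypothetical stand-in (a reference plus text content), not the library's actual class, and `ContentOnlyChecksummer` is an invented name. The interface is redeclared locally only so the example compiles on its own. Using SHA-256 is an assumption made for the example, not something this page prescribes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical, simplified stand-in for the library's Doc class.
final class Doc {
    private final String reference;
    private final String content;

    Doc(String reference, String content) {
        this.reference = reference;
        this.content = content;
    }

    String getReference() {
        return reference;
    }

    String getContent() {
        return content;
    }
}

// Local redeclaration of the method signature documented on this page.
interface IDocumentChecksummer {
    String createDocumentChecksum(Doc document);
}

// Hypothetical implementation: the checksum depends on content only,
// so the document reference (URL) is deliberately ignored.
final class ContentOnlyChecksummer implements IDocumentChecksummer {
    @Override
    public String createDocumentChecksum(Doc document) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(
                    document.getContent().getBytes(StandardCharsets.UTF_8));
            // Render the hash bytes as a lowercase hexadecimal string.
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

With this sketch, two `Doc` instances that differ only in their reference produce equal checksums, so a caller could keep one and skip the other. An implementation that should treat such documents as distinct would include the reference, or other fields, in the hashed input.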
AbstractDocumentChecksummer
Modifier and Type | Method and Description |
---|---|
String | createDocumentChecksum(Doc document): Creates a document checksum. |
Copyright © 2014–2023 Norconex Inc. All rights reserved.