Norconex Filesystem Collector

Flow Diagram

Do you sometimes wonder what your crawler is doing? Knowing more about the sequence of events taking place can help you better configure your crawling solution. The following flowcharts detail how each file encountered is processed. While they do not cover all available features, they should give you a better idea of what's going on.

The first flowchart shows how paths are "prepared" before being queued for processing by the next available thread. The second flowchart shows what happens when a thread takes the next path from the queue and "processes" it.
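As a rough illustration of this two-stage flow, the following Python sketch queues the paths that pass a filter and then lets worker threads take them from the queue one at a time. It is not the collector's actual code (the collector itself is a Java application): the names accept_reference, prepare, worker and crawl, and the ".tmp" filter rule, are hypothetical and exist only for this example. The sketch also queues everything before the workers start, which is a simplification.

```python
import queue
import threading


def accept_reference(path):
    # Hypothetical reference filter: reject temporary files.
    return not path.endswith(".tmp")


def prepare(paths, work_queue):
    """Stage 1: filter candidate paths and queue the accepted ones."""
    rejected = []
    for path in paths:
        if accept_reference(path):
            work_queue.put(path)
        else:
            rejected.append(path)
    return rejected


def worker(work_queue, processed, lock):
    """Stage 2: take the next path from the queue and process it."""
    while True:
        try:
            path = work_queue.get_nowait()
        except queue.Empty:
            return  # queue drained, this thread is done
        # Stand-in for the real work (fetching, importing, committing).
        with lock:
            processed.append(path)


def crawl(paths, num_threads=2):
    work_queue = queue.Queue()
    rejected = prepare(paths, work_queue)
    processed, lock = [], threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(work_queue, processed, lock))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Sort because the order in which threads finish is not deterministic.
    return sorted(processed), rejected


if __name__ == "__main__":
    print(crawl(["/docs/a.txt", "/docs/b.tmp", "/docs/c.pdf"]))
```

Separating the two stages with a queue means the filtering logic never needs to know how many threads are running, and each thread only ever sees paths that were already accepted.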


[Figure: two flowcharts, one for queuing paths and one for processing them. Shape labels: Start; Accepted by reference filters?; Queue; Rejected; Fetch metadata; Accepted by metadata filters?; New meta checksum?; Fetch document; Successful acquisition?; Save document; Extract paths from folders; Pre-process document; Accepted by document filters?; Import document; Accepted by Importer?; New doc checksum?; Post-process document; Delete?; Commit (add); Commit (delete).]