To use Neo4j with a Norconex Crawler, add the following XML as the <committer> section of your Norconex Crawler configuration.
<committer class="com.norconex.committer.neo4j.Neo4jCommitter">

  <!-- Mandatory settings: -->
  <uri>...</uri>
  <user>...</user>
  <password>...</password>
  <authentType>...</authentType>
  <multiValuesJoiner>...</multiValuesJoiner>

  <!-- Other settings: -->
  <nodeTopology>[ONE_NODE|NO_CONTENT|SPLITTED]</nodeTopology>
  <primaryLabel>...</primaryLabel>
  <relationships>
    <relationship type="..." direction="[NONE|INCOMING|OUTGOING|BOTH]">
      <sourcePropertyKey>...</sourcePropertyKey>
      <targetPropertyKey>...</targetPropertyKey>
    </relationship>
  </relationships>
  <additionalLabels>
    <sourceField keep="[false|true]">...</sourceField>
  </additionalLabels>

  <!-- Use the following if password is encrypted. -->
  <passwordKey>...</passwordKey>
  <passwordKeySource>[key|file|environment|property]</passwordKeySource>

  <sourceReferenceField keep="[false|true]">...</sourceReferenceField>
  <targetReferenceField>...</targetReferenceField>
  <sourceContentField keep="[false|true]">...</sourceContentField>
  <targetContentField>...</targetContentField>
  <queueDir>...</queueDir>
  <queueSize>...</queueSize>
  <commitBatchSize>...</commitBatchSize>
  <maxRetries>...</maxRetries>
  <maxRetryWait>...</maxRetryWait>
</committer>
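As a minimal sketch, the mandatory settings might be filled in as shown here. The URI is the example value given in the tag descriptions, and BASIC and "|" are the documented authentication type and default joiner. The username and password are assumptions for illustration only; replace them with your own credentials.

```xml
<committer class="com.norconex.committer.neo4j.Neo4jCommitter">
  <!-- Bolt URI of a local Neo4j instance (example value). -->
  <uri>bolt://localhost:7687</uri>
  <!-- Assumed credentials, for illustration only. -->
  <user>neo4j</user>
  <password>changeme</password>
  <!-- BASIC is the only supported authentication type for now. -->
  <authentType>BASIC</authentType>
  <!-- Documented default joiner for multi-value fields. -->
  <multiValuesJoiner>|</multiValuesJoiner>
</committer>
```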
Tag descriptions:
Tag | Description
---|---
uri | Connection URI, e.g. "bolt://localhost:7687".
user | The Neo4j username.
password | The Neo4j password.
passwordKey | Optional password key if the password is encrypted. Refer to the API Documentation for more details.
passwordKeySource | Optional password encryption key source. One of key, file, environment, or property. Refer to the API Documentation for more details.
authentType | Only BASIC is supported for now.
multiValuesJoiner | One or more characters to join multi-value fields. Default is "\|".
nodeTopology | The structure of a node for a committed document. Possible values: ONE_NODE, NO_CONTENT, or SPLITTED.
primaryLabel | Primary label name used for all created nodes.
additionalLabels | Additional labels to add to a newly created node. To add them, specify one or more metadata fields using sourceField elements.
relationships | Defines relationships between nodes. If a source field/property or target field/property does not exist, it is created automatically. Possible values for the "direction" attribute are NONE, INCOMING, OUTGOING, or BOTH. The "type" attribute is an identifier/name for your relationship.
sourceReferenceField | Name of the source field that will be mapped to the Neo4j target id field. Default is the document reference the Committer stores as committer.reference. Once re-mapped, the metadata source field is deleted, unless keep is set to true.
targetReferenceField | Name of the target id field. Default is id. Typically a tableName primary key.
sourceContentField | Name of the source field containing the document content/body. By default the content is taken from the document body rather than from a field. Once re-mapped, the metadata source field is deleted, unless keep is set to true.
targetContentField | Neo4j target field name for the document content/body. Default is content.
queueDir | Path where files are queued before being sent to Neo4j. Default is ./committer-queue.
queueSize | Number of documents or deletes to queue before sending to Neo4j. Default is 1000.
commitBatchSize | Maximum number of documents to send to Neo4j at once. Default is 100.
maxRetries | Maximum number of retries upon commit failures. Default is 0 (no retry).
maxRetryWait | Delay between retries. Default is 0 (no delay).
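
The optional settings described above can be combined as in the following sketch. The label, relationship type, and field names (Document, LINKED_TO, referencedUrls, category) are hypothetical placeholders, not names defined by the committer, and the credentials are assumptions. The retry wait is assumed to be expressed in milliseconds, which the table does not state; check the API Documentation before relying on it.

```xml
<committer class="com.norconex.committer.neo4j.Neo4jCommitter">
  <!-- Mandatory settings (credentials are assumed values). -->
  <uri>bolt://localhost:7687</uri>
  <user>neo4j</user>
  <password>changeme</password>
  <authentType>BASIC</authentType>
  <multiValuesJoiner>|</multiValuesJoiner>

  <!-- One of the documented topology values. -->
  <nodeTopology>ONE_NODE</nodeTopology>
  <!-- Hypothetical label applied to every created node. -->
  <primaryLabel>Document</primaryLabel>

  <relationships>
    <!-- Hypothetical relationship type and property keys. -->
    <relationship type="LINKED_TO" direction="OUTGOING">
      <sourcePropertyKey>referencedUrls</sourcePropertyKey>
      <targetPropertyKey>id</targetPropertyKey>
    </relationship>
  </relationships>

  <additionalLabels>
    <!-- Hypothetical metadata field; keep="true" retains the source field. -->
    <sourceField keep="true">category</sourceField>
  </additionalLabels>

  <!-- Queue settings at their documented defaults. -->
  <queueDir>./committer-queue</queueDir>
  <queueSize>1000</queueSize>
  <commitBatchSize>100</commitBatchSize>

  <!-- Retry up to 3 times; wait unit assumed to be milliseconds. -->
  <maxRetries>3</maxRetries>
  <maxRetryWait>5000</maxRetryWait>
</committer>
```

Leaving the retry settings out keeps the documented defaults of no retry and no delay.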