When using Azure Search with a Norconex Crawler,
add the following XML as the `<committer>` section of your
crawler configuration:

    <committer class="com.norconex.committer.azuresearch.AzureSearchCommitter">
        <endpoint>...</endpoint>
        <apiVersion>...</apiVersion>
        <apiKey>...</apiKey>
        <indexName>...</indexName>
        <disableReferenceEncoding>[false|true]</disableReferenceEncoding>
        <ignoreValidationErrors>[false|true]</ignoreValidationErrors>
        <ignoreResponseErrors>[false|true]</ignoreResponseErrors>
        <useWindowsAuth>[false|true]</useWindowsAuth>
        <proxyHost>...</proxyHost>
        <proxyPort>...</proxyPort>
        <proxyRealm>...</proxyRealm>
        <proxyScheme>...</proxyScheme>
        <proxyUsername>...</proxyUsername>
        <proxyPassword>...</proxyPassword>
        <proxyPasswordKey>...</proxyPasswordKey>
        <proxyPasswordKeySource>[key|file|environment|property]</proxyPasswordKeySource>
        <sourceReferenceField keep="[false|true]">...</sourceReferenceField>
        <targetReferenceField>...</targetReferenceField>
        <sourceContentField keep="[false|true]">...</sourceContentField>
        <targetContentField>...</targetContentField>
        <queueDir>...</queueDir>
        <queueSize>...</queueSize>
        <commitBatchSize>...</commitBatchSize>
        <maxRetries>...</maxRetries>
        <maxRetryWait>...</maxRetryWait>
    </committer>

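
The template above lists every available tag. As a rough sketch, a configuration that sets only the connection tags might look like the following. The service name, key, and index name are hypothetical placeholders, and treating every other tag as optional is an assumption based on the defaults documented for them, not something this document states outright.

```xml
<!-- Minimal sketch: all values below are placeholders, not real credentials. -->
<committer class="com.norconex.committer.azuresearch.AzureSearchCommitter">
  <!-- Hypothetical service name; follows https://[service name].search.windows.net -->
  <endpoint>https://example-service.search.windows.net</endpoint>
  <!-- Placeholder for your Azure Search API admin key -->
  <apiKey>YOUR_ADMIN_KEY</apiKey>
  <!-- Hypothetical index name -->
  <indexName>example-index</indexName>
</committer>
```
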
Tag descriptions:
| Tag | Description |
|---|---|
| endpoint | Azure Search endpoint (https://[service name].search.windows.net). |
| indexName | Index name to use when committing documents to Azure Search. |
| apiKey | Azure Search API admin key. |
| apiVersion | Optional Azure Search API version to use. |
| disableReferenceEncoding | Disable URL-safe Base64 encoding of document references. Default is false. |
| ignoreValidationErrors | When true, errors detected by the committer during validation are logged instead of being thrown as exceptions. Default is false. |
| ignoreResponseErrors | When true, errors returned by Azure Search are logged instead of being thrown as exceptions. Default is false. |
| useWindowsAuth | Whether to use Windows Authentication. Default is false. |
| proxyHost | Optional proxy host. |
| proxyPort | Optional proxy port. |
| proxyRealm | Optional proxy realm. |
| proxyScheme | Optional proxy scheme. |
| proxyUsername | Optional proxy username. |
| proxyPassword | Optional proxy password. |
| proxyPasswordKey | Optional proxy password key if password is encrypted. Refer to the API Documentation for more details. |
| proxyPasswordKeySource | Optional password encryption key source. One of key, file, environment, or property. Refer to the API Documentation for more details. |
| sourceReferenceField | Name of source field that will be mapped to the Azure Search id field. Default is the document reference the Committer stores as document.reference. The metadata source field is deleted, unless keep is set to true. |
| targetReferenceField | Name of target id field. Default is id. |
| sourceContentField | Source field name for the document content/body. Default is not a field, but rather the document body content. Once re-mapped, the metadata source field is deleted, unless keep is set to true. |
| targetContentField | Target field name for the document content/body. Default is content. |
| queueDir | Optional path where files are queued before being sent to Azure Search. Default is ./committer-queue. |
| queueSize | Optional maximum queue size before documents are sent to Azure Search. Default is 1000. |
| commitBatchSize | Optional maximum number of documents to send to Azure Search at once. Default is 100. Maximum is 1000. |
| maxRetries | Maximum retries upon commit failures. Default is 0 (no retry). |
| maxRetryWait | Maximum delay between retries, in milliseconds. Default is 0 (no delay). |
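
To show how the mapping, queueing, and retry tags described in the table fit together, here is an illustrative sketch. The field names `docId` and `body`, the placeholder connection values, and all numeric values are assumptions chosen for the example, not recommendations. The batch size only respects the documented maximum of 1000.

```xml
<!-- Illustrative sketch only: field names and numbers are hypothetical. -->
<committer class="com.norconex.committer.azuresearch.AzureSearchCommitter">
  <endpoint>https://example-service.search.windows.net</endpoint>
  <apiKey>YOUR_ADMIN_KEY</apiKey>
  <indexName>example-index</indexName>

  <!-- Use the (hypothetical) docId metadata field as the Azure Search id,
       and keep the source field instead of deleting it after mapping. -->
  <sourceReferenceField keep="true">docId</sourceReferenceField>
  <targetReferenceField>id</targetReferenceField>

  <!-- Store the document body in a (hypothetical) index field named body
       instead of the default content field. -->
  <targetContentField>body</targetContentField>

  <!-- Queue up to 2000 documents, then send them in batches of 500
       (the documented maximum batch size is 1000). -->
  <queueDir>./committer-queue</queueDir>
  <queueSize>2000</queueSize>
  <commitBatchSize>500</commitBatchSize>

  <!-- Retry failed commits up to 3 times, waiting at most 5000 ms between tries. -->
  <maxRetries>3</maxRetries>
  <maxRetryWait>5000</maxRetryWait>
</committer>
```

Leaving `sourceContentField` unset keeps the default behavior described above, where the document body itself is used as the content.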