When used with a Norconex Crawler, use the following XML as the `<committer>`
section of your crawler configuration to commit documents to Azure Search:

    <committer class="com.norconex.committer.azuresearch.AzureSearchCommitter">
      <endpoint>...</endpoint>
      <apiVersion>...</apiVersion>
      <apiKey>...</apiKey>
      <indexName>...</indexName>
      <disableReferenceEncoding>[false|true]</disableReferenceEncoding>
      <ignoreValidationErrors>[false|true]</ignoreValidationErrors>
      <ignoreResponseErrors>[false|true]</ignoreResponseErrors>
      <useWindowsAuth>[false|true]</useWindowsAuth>
      <proxyHost>...</proxyHost>
      <proxyPort>...</proxyPort>
      <proxyRealm>...</proxyRealm>
      <proxyScheme>...</proxyScheme>
      <proxyUsername>...</proxyUsername>
      <proxyPassword>...</proxyPassword>
      <proxyPasswordKey>...</proxyPasswordKey>
      <proxyPasswordKeySource>[key|file|environment|property]</proxyPasswordKeySource>
      <sourceReferenceField keep="[false|true]">...</sourceReferenceField>
      <targetReferenceField>...</targetReferenceField>
      <sourceContentField keep="[false|true]">...</sourceContentField>
      <targetContentField>...</targetContentField>
      <queueDir>...</queueDir>
      <queueSize>...</queueSize>
      <commitBatchSize>...</commitBatchSize>
      <maxRetries>...</maxRetries>
      <maxRetryWait>...</maxRetryWait>
    </committer>

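As a rough illustration, a minimal configuration might set only the endpoint, API key, and index name, leaving every other tag at its default. The service name, key placeholder, and index name below are hypothetical, and the sketch assumes the remaining tags can be omitted when their defaults are acceptable.

```xml
<committer class="com.norconex.committer.azuresearch.AzureSearchCommitter">
  <!-- Hypothetical service name; use your own Azure Search endpoint. -->
  <endpoint>https://example-service.search.windows.net</endpoint>
  <!-- Placeholder only; supply your own admin key. -->
  <apiKey>YOUR_ADMIN_API_KEY</apiKey>
  <!-- Hypothetical index name. -->
  <indexName>example-index</indexName>
</committer>
```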
Tag descriptions:
Tag | Description |
---|---|
endpoint | Azure Search endpoint (https://[service name].search.windows.net). |
indexName | Index name to use when committing documents to Azure Search. |
apiKey | Azure Search API admin key. |
apiVersion | Optional Azure Search API version to use. |
disableReferenceEncoding | Disable URL-safe Base64 encoding of document references. Default is `false`. |
ignoreValidationErrors | Ignoring validation errors will log errors detected by the committer instead of throwing exceptions. Default is `false`. |
ignoreResponseErrors | Ignoring response errors will log errors returned by Azure Search instead of throwing exceptions. Default is `false`. |
useWindowsAuth | Whether to use Windows Authentication. Default is `false`. |
proxyHost | Optional proxy host. |
proxyPort | Optional proxy port. |
proxyRealm | Optional proxy realm. |
proxyScheme | Optional proxy scheme. |
proxyUsername | Optional proxy username. |
proxyPassword | Optional proxy password. |
proxyPasswordKey | Optional proxy password key if password is encrypted. Refer to the API Documentation for more details. |
proxyPasswordKeySource | Optional password encryption key source. One of `key`, `file`, `environment`, or `property`. Refer to the API Documentation for more details. |
sourceReferenceField | Name of the source field that will be mapped to the Azure Search id field. Default is the document reference the Committer stores as `document.reference`. The metadata source field is deleted, unless `keep` is set to `true`. |
targetReferenceField | Name of target id field. Default is id . |
sourceContentField | Source field name for the document content/body. Default is not a field, but rather the document body content. Once re-mapped, the metadata source field is deleted, unless `keep` is set to `true`. |
targetContentField | Target field name for the document content/body. Default is `content`. |
queueDir | Optional path where files are queued before being sent to Azure Search. Default is `./committer-queue`. |
queueSize | Optional maximum queue size before sending documents to Azure Search. Default is `1000`. |
commitBatchSize | Optional maximum number of documents to send to Azure Search at once. Default is `100`. Maximum is `1000`. |
maxRetries | Maximum number of retries upon commit failures. Default is `0` (no retry). |
maxRetryWait | Maximum delay between retries, in milliseconds. Default is `0` (no delay). |
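
The field mapping, queueing, and retry tags described in the table can be combined in one configuration. The following is only a sketch of one possible setup, not a recommended one. The service name, index name, `docId` source field, and `body` target field are hypothetical, and the numeric values are arbitrary choices within the limits stated in the table.

```xml
<committer class="com.norconex.committer.azuresearch.AzureSearchCommitter">
  <!-- Hypothetical service and index names; replace with your own. -->
  <endpoint>https://example-service.search.windows.net</endpoint>
  <apiKey>YOUR_ADMIN_API_KEY</apiKey>
  <indexName>example-index</indexName>

  <!-- Map the (hypothetical) "docId" metadata field to the target id field,
       keeping the source field on the document. -->
  <sourceReferenceField keep="true">docId</sourceReferenceField>
  <targetReferenceField>id</targetReferenceField>

  <!-- Send the document body to a (hypothetical) "body" index field. -->
  <targetContentField>body</targetContentField>

  <!-- Queue size of 1000; send at most 500 documents at once (limit is 1000). -->
  <queueDir>./committer-queue</queueDir>
  <queueSize>1000</queueSize>
  <commitBatchSize>500</commitBatchSize>

  <!-- Retry failed commits up to 3 times, waiting at most 5000 ms between tries. -->
  <maxRetries>3</maxRetries>
  <maxRetryWait>5000</maxRetryWait>
</committer>
```

Because `sourceContentField` is left out here, the document body itself is used as the content, as described in the table.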