public class SQLCommitter extends AbstractBatchCommitter
Commit documents to an SQL table. Document metadata fields are mapped to table columns.
By default, this Committer throws an exception when trying to insert values into a database table or fields that do not exist. It is recommended that you make sure your database table exists and that the document fields sent to the committer match your database fields.
Alternatively, you can provide the SQL statements needed to create a new table and to add new fields as needed, using SQLCommitterConfig.setCreateTableSQL(String) and SQLCommitterConfig.setCreateFieldSQL(String), respectively. Use the following placeholder variables as needed in the provided table and field creation SQL:
{tableName}: replaced with the value set by SQLCommitterConfig.setTableName(String).
{primaryKey}: replaced with the value set by SQLCommitterConfig.setPrimaryKey(String).
{fieldName}: replaced with each newly encountered field name (field creation SQL only).
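To make the placeholder substitution concrete, here is a minimal sketch of how such variables could be expanded before a statement is executed. The SqlPlaceholders class, its expand method, and the "id" primary key name are hypothetical and used for illustration only; the committer performs its own substitution internally.

```java
import java.util.Map;

// Hypothetical helper for illustration; not part of the SQLCommitter API.
final class SqlPlaceholders {

    // Replaces each {name} placeholder that has an entry in "values".
    static String expand(String sql, Map<String, String> values) {
        String result = sql;
        for (Map.Entry<String, String> e : values.entrySet()) {
            result = result.replace("{" + e.getKey() + "}", e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // "id" is an assumed primary key name.
        String createTable = "CREATE TABLE {tableName} ("
                + "{primaryKey} VARCHAR(32672) NOT NULL, "
                + "PRIMARY KEY ({primaryKey}))";
        System.out.println(expand(createTable,
                Map.of("tableName", "test_table", "primaryKey", "id")));

        String createField =
                "ALTER TABLE {tableName} ADD {fieldName} VARCHAR(5000)";
        System.out.println(expand(createField,
                Map.of("tableName", "test_table", "fieldName", "title")));
    }
}
```

If you hard-code names in your SQL instead of using the placeholders, they need to match the configured table name and primary key.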
Passwords can be encrypted using EncryptionUtil (or the command-line "encrypt.bat" or "encrypt.sh" scripts, if those are available to you). For the password to be decrypted properly, you need to specify the encryption key used to encrypt it. The key can be obtained from a few supported locations. The combination of the password key "value" and "source" is used to locate the key.
The supported sources are:
Source | Description |
---|---|
key | The actual encryption key. |
file | Path to a file containing the encryption key. |
environment | Name of an environment variable containing the key. |
property | Name of a JVM system property containing the key. |
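The way a "value" and "source" pair could lead to an actual key can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the KeySourceSketch class, its Source enum, and the resolve method are hypothetical names, and trimming the file content is an assumption.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical illustration; the real lookup is performed by the library.
final class KeySourceSketch {

    enum Source { KEY, FILE, ENVIRONMENT, PROPERTY }

    // Returns the encryption key located by the "value"/"source" pair,
    // or null when the environment variable or system property is not set.
    static String resolve(String value, Source source) {
        switch (source) {
            case KEY:
                // The value is the key itself.
                return value;
            case FILE:
                // The value is a path to a file holding the key.
                try {
                    byte[] bytes = Files.readAllBytes(Paths.get(value));
                    // Trimming surrounding whitespace is an assumption.
                    return new String(bytes, StandardCharsets.UTF_8).trim();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            case ENVIRONMENT:
                // The value is the name of an environment variable.
                return System.getenv(value);
            case PROPERTY:
                // The value is the name of a JVM system property.
                return System.getProperty(value);
            default:
                throw new IllegalArgumentException(
                        "Unsupported source: " + source);
        }
    }

    public static void main(String[] args) {
        System.setProperty("demo.key", "secret-from-property");
        System.out.println(resolve("secret-inline", Source.KEY));
        System.out.println(resolve("demo.key", Source.PROPERTY));
    }
}
```

Keeping the key in a file, environment variable, or system property avoids storing it next to the encrypted password in the configuration.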
Optionally apply a committer only to certain types of documents. Documents are restricted based on their metadata field names and values. This option can be used to perform document routing when you have multiple committers defined.
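As a rough sketch of the routing idea, the following code accepts a document when at least one restriction matches one of its metadata fields and values. It assumes regular-expression matching only and treats an empty restriction list as "accept everything"; the RestrictToSketch class, its nested Restriction type, and the "contentType" field are hypothetical and not the library's API.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration of restriction-based document routing.
final class RestrictToSketch {

    // One restriction: a field-name pattern paired with a value pattern.
    static final class Restriction {
        final Pattern field;
        final Pattern value;

        Restriction(String fieldRegex, String valueRegex) {
            this.field = Pattern.compile(fieldRegex);
            this.value = Pattern.compile(valueRegex);
        }

        // True when a matching field holds at least one matching value.
        boolean matches(Map<String, List<String>> metadata) {
            return metadata.entrySet().stream()
                    .filter(e -> field.matcher(e.getKey()).matches())
                    .flatMap(e -> e.getValue().stream())
                    .anyMatch(v -> value.matcher(v).matches());
        }
    }

    // Accepts when there are no restrictions (an assumption) or when
    // any single restriction matches.
    static boolean accept(List<Restriction> restrictions,
            Map<String, List<String>> metadata) {
        return restrictions.isEmpty()
                || restrictions.stream().anyMatch(r -> r.matches(metadata));
    }

    public static void main(String[] args) {
        List<Restriction> pdfOnly =
                List.of(new Restriction("contentType", "application/pdf"));
        System.out.println(accept(pdfOnly,
                Map.of("contentType", List.of("application/pdf"))));
        System.out.println(accept(pdfOnly,
                Map.of("contentType", List.of("text/html"))));
    }
}
```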
By default, this abstract class applies field mappings to metadata fields, but leaves the document reference and content (input stream) for concrete implementations to handle. In other words, field mappings only apply to a committer request's metadata. They are performed on committer requests before upserts and deletes are actually performed.
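A field mapping amounts to renaming a metadata field before the request is committed. The sketch below illustrates that idea on a plain map; the FieldMappingSketch class and the field names are hypothetical, and merging values when the target field already exists is an assumption rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of applying "fromField" to "toField" mappings.
final class FieldMappingSketch {

    // Returns a copy of the metadata where each mapped source field is
    // stored under its target name. Unmapped fields keep their name.
    static Map<String, List<String>> apply(
            Map<String, List<String>> metadata,
            Map<String, String> mappings) {
        Map<String, List<String>> mapped = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : metadata.entrySet()) {
            String target = mappings.getOrDefault(e.getKey(), e.getKey());
            // Assumption: values are merged if the target already exists.
            mapped.computeIfAbsent(target, k -> new ArrayList<>())
                    .addAll(e.getValue());
        }
        return mapped;
    }

    public static void main(String[] args) {
        Map<String, List<String>> metadata = new LinkedHashMap<>();
        metadata.put("dc:title", List.of("Hello"));
        metadata.put("author", List.of("Jane"));
        System.out.println(apply(metadata, Map.of("dc:title", "title")));
    }
}
```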
<committer
class="com.norconex.committer.sql.SQLCommitter">
<!-- Mandatory settings -->
<driverClass>(Class name of the JDBC driver to use.)</driverClass>
<connectionUrl>(JDBC connection URL.)</connectionUrl>
<tableName>
(The target database table name where documents will be committed.)
</tableName>
<primaryKey>
(The name of the table primary key field where the document
reference will be stored, unless it is already set by a field of
the same name in the source document. At a minimum, this "primaryKey"
field name should be "unique", ideally indexed.)
</primaryKey>
<!-- Other settings -->
<driverPath>
(Path to JDBC driver. Not required if already in classpath.)
</driverPath>
<properties>
<property
name="(property name)">
(Property value.)
</property>
<!-- You can have multiple properties. -->
</properties>
<createTableSQL>
<!--
The CREATE statement used to create a table if it does not
already exist. If you need fields of specific data types,
specify them here. You can use the variable placeholders {tableName}
and {primaryKey} which will be replaced with the configuration option
of the same name. If you do not use those variables, make sure you use
the same names.
See usage sample.
-->
</createTableSQL>
<createFieldSQL>
<!--
The ALTER statement used to create missing table fields.
The {tableName} variable will be replaced with
the configuration option of the same name. The {fieldName}
variable will be replaced by newly encountered field names.
See usage sample.
-->
</createFieldSQL>
<multiValuesJoiner>
(One or more characters to join multi-value fields.
Default is "|".)
</multiValuesJoiner>
<fixFieldNames>
[false|true]
(Attempts to prevent insertion errors by converting characters that
are not underscores or alphanumeric to underscores.
Will also remove all non-alphabetic characters that prefix
a field name.)
</fixFieldNames>
<fixFieldValues>
[false|true]
(Attempts to prevent insertion errors by truncating values
that are larger than their defined maximum field length.)
</fixFieldValues>
<!-- Use the following if authentication is required. -->
<credentials>
<username>(the username)</username>
<password>(the optionally encrypted password)</password>
<passwordKey>
<value>(The actual password encryption key or a reference to it.)</value>
<source>[key|file|environment|property]</source>
<size>(Size in bits of encryption key. Default is 128.)</size>
</passwordKey>
</credentials>
<targetContentField>
(Table field name where to store the document content stream.
Make it empty or a self-closed tag if you do not want to store the
document content. Since document content can sometimes be quite
large, a CLOB field is usually advised.
If there is already a document field with the same name, that
document field takes precedence and the content stream is ignored.
Default is "content".)
</targetContentField>
<!-- multiple "restrictTo" tags allowed (only one needs to match) -->
<restrictTo>
<fieldMatcher
method="[basic|csv|wildcard|regex]"
ignoreCase="[false|true]"
ignoreDiacritic="[false|true]"
partial="[false|true]">
(field-matching expression)
</fieldMatcher>
<valueMatcher
method="[basic|csv|wildcard|regex]"
ignoreCase="[false|true]"
ignoreDiacritic="[false|true]"
partial="[false|true]">
(value-matching expression)
</valueMatcher>
</restrictTo>
<fieldMappings>
<!-- Add as many field mappings as needed -->
<mapping
fromField="(source field name)"
toField="(target field name)"/>
</fieldMappings>
<!-- Settings for default queue implementation ("class" is optional): -->
<queue
class="com.norconex.committer.core3.batch.queue.impl.FSQueue">
<batchSize>
(Optional number of documents queued after which we process a batch.
Default is 20.)
</batchSize>
<maxPerFolder>
(Optional maximum number of files or directories that can be queued
in a single folder before a new one gets created. Default is 500.)
</maxPerFolder>
<commitLeftoversOnInit>
(Optionally force the commit of any leftover documents from a previous
execution, e.g., one that ended prematurely. Default is "false".)
</commitLeftoversOnInit>
<onCommitFailure>
<splitBatch>[OFF|HALF|ONE]</splitBatch>
<maxRetries>(Max retries upon commit failures. Default is 0.)</maxRetries>
<retryDelay>
(Delay in milliseconds between retries. Default is 0.)
</retryDelay>
<ignoreErrors>
[false|true]
(When true, non-critical exceptions encountered when interacting with
the target repository are not thrown, so that execution can continue
with other files to be committed. Instead, errors are logged.
In both cases the failing batch/files are moved to an
"error" folder. Other types of exceptions may still be thrown.)
</ignoreErrors>
</onCommitFailure>
</queue>
</committer>
XML configuration entries expecting millisecond durations
can be provided in human-readable format (English only), as per
DurationParser
(e.g., "5 minutes and 30 seconds" or "5m30s").
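The fixFieldNames, fixFieldValues, and multiValuesJoiner options described in the configuration above can be illustrated with a small sketch. It follows the documented descriptions only and is not the committer's actual code: the FieldFixSketch class and its methods are hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, and the maximum length is passed in as a parameter instead of being read from the table definition.

```java
import java.util.List;

// Hypothetical illustration of the documented field-fixing options.
final class FieldFixSketch {

    // fixFieldNames: converts characters that are not underscores or
    // alphanumeric to underscores, then removes any non-alphabetic prefix.
    static String fixFieldName(String name) {
        String fixed = name.replaceAll("[^A-Za-z0-9_]", "_");
        return fixed.replaceFirst("^[^A-Za-z]+", "");
    }

    // fixFieldValues: truncates a value longer than the maximum length.
    static String fixFieldValue(String value, int maxLength) {
        return value.length() > maxLength
                ? value.substring(0, maxLength)
                : value;
    }

    // multiValuesJoiner: joins multiple values into a single string.
    static String joinValues(List<String> values, String joiner) {
        return String.join(joiner, values);
    }

    public static void main(String[] args) {
        System.out.println(fixFieldName("1-dc:title"));
        System.out.println(fixFieldValue("abcdef", 3));
        System.out.println(joinValues(List.of("red", "green"), "|"));
    }
}
```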
<committer
class="com.norconex.committer.sql.SQLCommitter">
<driverPath>/path/to/driver/h2.jar</driverPath>
<driverClass>org.h2.Driver</driverClass>
<connectionUrl>jdbc:h2:file:///path/to/db/h2</connectionUrl>
<tableName>test_table</tableName>
<createTableSQL>
CREATE TABLE {tableName} (
{primaryKey} VARCHAR(32672) NOT NULL,
content CLOB,
PRIMARY KEY ( {primaryKey} ),
title VARCHAR(256),
author VARCHAR(256)
)
</createTableSQL>
<createFieldSQL>
ALTER TABLE {tableName} ADD {fieldName} VARCHAR(5000)
</createFieldSQL>
<fixFieldValues>true</fixFieldValues>
</committer>
The above example uses an H2 database and creates the table and fields as they are encountered, storing all new fields as VARCHAR(5000). With fixFieldValues enabled, values longer than a field's maximum length are truncated, so values in those new fields are never longer than 5000 characters.
Constructor and Description |
---|
SQLCommitter() |
SQLCommitter(SQLCommitterConfig config) |
Modifier and Type | Method and Description |
---|---|
protected void | closeBatchCommitter() |
protected void | commitBatch(Iterator<ICommitterRequest> it) |
boolean | equals(Object other) |
SQLCommitterConfig | getConfig() |
int | hashCode() |
protected void | initBatchCommitter() |
protected void | loadBatchCommitterFromXML(XML xml) |
protected void | saveBatchCommitterToXML(XML xml) |
String | toString() |
Methods inherited from class AbstractBatchCommitter: consume, doClean, doClose, doDelete, doInit, doUpsert, getCommitterQueue, loadCommitterFromXML, saveCommitterToXML, setCommitterQueue
Methods inherited from class AbstractCommitter: accept, addRestriction, addRestrictions, applyFieldMappings, clean, clearFieldMappings, clearRestrictions, close, delete, fireDebug, fireDebug, fireError, fireError, fireInfo, fireInfo, getCommitterContext, getFieldMappings, getRestrictions, init, loadFromXML, removeFieldMapping, removeRestriction, removeRestriction, saveToXML, setFieldMapping, setFieldMappings, upsert
Methods inherited from class java.lang.Object: clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface IXMLConfigurable: loadFromXML, saveToXML
public SQLCommitter()
public SQLCommitter(SQLCommitterConfig config)
protected void initBatchCommitter() throws CommitterException
initBatchCommitter in class AbstractBatchCommitter
Throws: CommitterException
protected void commitBatch(Iterator<ICommitterRequest> it) throws CommitterException
commitBatch in class AbstractBatchCommitter
Throws: CommitterException
protected void closeBatchCommitter() throws CommitterException
closeBatchCommitter in class AbstractBatchCommitter
Throws: CommitterException
public SQLCommitterConfig getConfig()
protected void loadBatchCommitterFromXML(XML xml)
loadBatchCommitterFromXML in class AbstractBatchCommitter
protected void saveBatchCommitterToXML(XML xml)
saveBatchCommitterToXML in class AbstractBatchCommitter
public boolean equals(Object other)
equals in class AbstractBatchCommitter
public int hashCode()
hashCode in class AbstractBatchCommitter
public String toString()
toString in class AbstractBatchCommitter
Copyright © 2017–2022 Norconex Inc. All rights reserved.