Siddhi IO CDC
The siddhi-io-cdc extension is an extension to Siddhi that captures change data from databases such as MySQL, MS SQL, PostgreSQL, H2 and Oracle.
For information on Siddhi and its features, refer to the Siddhi Documentation.
- Versions 3.x and above with group id
- Versions 2.x and lower with group id
Latest API Docs
Latest API Docs is 2.0.14.
- cdc (Source)
The CDC source receives events when change events (i.e., INSERT, UPDATE, DELETE) are triggered for a database table. Events are received in the 'key-value' format.
CDC can be performed in two modes: listening mode and polling mode.
In polling mode, the datasource is periodically polled for capturing the changes. The polling period can be configured.
In polling mode, you can only capture INSERT and UPDATE changes.
In listening mode, the source keeps listening to the change log of the database and notifies you when a change takes place. Unlike in polling mode, you are notified of the change immediately.
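To make the polling idea concrete, here is a minimal Python sketch. It is not the extension's implementation. It uses an in-memory SQLite table, and the table `students`, the column `last_updated` and the helper `poll_changes` are hypothetical names chosen for illustration. Each poll selects the rows whose polling column has advanced past the last value seen, which is why inserts and updates are picked up while a delete leaves nothing to select.

```python
import sqlite3


def poll_changes(conn, table, polling_column, last_seen):
    """Return (events, new_last_seen) for rows changed since last_seen.

    The table and column names are interpolated directly, so they must come
    from trusted configuration, never from user input.
    """
    cur = conn.execute(
        f"SELECT * FROM {table} WHERE {polling_column} > ? ORDER BY {polling_column}",
        (last_seen,),
    )
    columns = [d[0] for d in cur.description]
    # Each event is a key-value map keyed by the table's column names.
    events = [dict(zip(columns, row)) for row in cur.fetchall()]
    if events:
        last_seen = events[-1][polling_column]
    return events, last_seen


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, last_updated INTEGER)"
)

# An INSERT advances the polling column, so the next poll sees the row.
conn.execute("INSERT INTO students VALUES (1, 'Ann', 1)")
after_insert, mark = poll_changes(conn, "students", "last_updated", 0)

# An UPDATE that bumps the polling column is seen as well.
conn.execute("UPDATE students SET name = 'Anna', last_updated = 2 WHERE id = 1")
after_update, mark = poll_changes(conn, "students", "last_updated", mark)

# A DELETE removes the row, so there is nothing left for the poll to select.
conn.execute("DELETE FROM students WHERE id = 1")
after_delete, mark = poll_changes(conn, "students", "last_updated", mark)
```

In this sketch the polling column is a simple counter. A timestamp column plays the same role, as long as every update moves it forward.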
The key values of the map of a CDC change event are as follows.
For 'listening' mode:
For insert: Keys are specified as columns of the table.
For delete: Keys are the table column names prefixed with 'before_'. e.g., the key 'before_X' holds the value that the column named 'X' had before the row was deleted.
For update: Keys are the table column names, together with the same names prefixed with 'before_'. e.g., the key 'X' holds the new value of the column named 'X', and the key 'before_X' holds its value before the update.
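The listening-mode key layouts can be illustrated with a short Python sketch. The helper `listening_mode_keys` is hypothetical and only mirrors the naming rule. It assumes that an update event carries both the 'before_' keys and the plain column keys, which is how the extension's API documentation examples map update events.

```python
def listening_mode_keys(operation, columns):
    """Return the keys expected in the key-value map of a change event."""
    before = [f"before_{column}" for column in columns]
    if operation == "insert":
        return list(columns)           # new row values only
    if operation == "delete":
        return before                  # values of the removed row
    if operation == "update":
        return before + list(columns)  # old values, then new values
    raise ValueError(f"unsupported operation: {operation}")


# Example event for an update on a hypothetical table with columns id and name.
update_event = {"before_id": 1, "before_name": "Ann", "id": 1, "name": "Anna"}
assert sorted(update_event) == sorted(listening_mode_keys("update", ["id", "name"]))
```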
For 'polling' mode: Keys are specified as the columns of the table.
To connect to the database table and receive CDC events, the url, username, password and driverClassName (in polling mode) can be provided in the deployment.yaml file under the siddhi namespace as below.
```yaml
siddhi:
  extensions:
    -
      extension:
        name: 'cdc'
        namespace: 'source'
        properties:
          url: jdbc:sqlserver://localhost:1433;databaseName=CDC_DATA_STORE
          password: <password>
          username: <username>
          driverClassName: com.microsoft.sqlserver.jdbc.SQLServerDriver
```
Preparations required for working with Oracle Databases in listening mode
Using the extension on Windows, Mac OSX and AIX is fairly straightforward. To achieve the required behaviour, follow the steps given below.
- Download the version of the Oracle instantclient that is compatible with your database version from here, and extract it.
- Set the environment variable LD_LIBRARY_PATH to the location of the extracted instantclient, as shown below.
export LD_LIBRARY_PATH=<path to the instant client location>
- The downloaded instantclient folder contains two jars, ojdbc<version>.jar and xstreams.jar. Convert them to OSGi bundles using the tools provided in <distribution>/bin. To convert ojdbc.jar, use the spi-provider.sh|bat tool. To convert xstreams.jar, use jni-provider.sh as shown below. (Note: converting the xstreams jar this way applies only to Linux environments. For other OSs this step is not required, and converting it with the jartobundle.sh tool is enough.)
./jni-provider.sh <input-jar> <destination> <comma separated native library names>
- Once the ojdbc and xstreams jars are converted to OSGi bundles, copy the generated jars to <distribution>/lib.
Currently, siddhi-io-cdc supports only Oracle database distributions 12 and above.
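Before converting the jars, it can help to confirm that LD_LIBRARY_PATH really points at the extracted instantclient folder. The following Python sketch is a hypothetical helper, not part of the extension or its tooling. It lists the jar files found in each existing directory on LD_LIBRARY_PATH.

```python
import os
from pathlib import Path


def jars_on_library_path(env=None):
    """Map each existing LD_LIBRARY_PATH directory to the jar names inside it."""
    env = os.environ if env is None else env
    found = {}
    for entry in env.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        if not entry:
            continue  # skip empty entries such as a trailing separator
        directory = Path(entry)
        if directory.is_dir():
            found[str(directory)] = sorted(p.name for p in directory.glob("*.jar"))
    return found


if __name__ == "__main__":
    # Print what the current environment exposes; an empty result means the
    # variable is unset or does not point at an existing directory.
    for directory, jars in jars_on_library_path().items():
        print(directory, jars)
```

If the instantclient directory is missing from the output, the export step above has not taken effect in the current shell.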
Configurations for PostgreSQL
When using listening mode with PostgreSQL, the following properties have to be configured to create the connection.
slot.name: (default value = debezium) In PostgreSQL, only one connection can be created from a single slot, so a custom slot.name should be provided to create multiple connections.
plugin.name: (default value = decoderbufs) The name of the logical decoding output plugin that the database is configured with. Supported values are decoderbufs, pgoutput and wal2json.
table.name: The table name should be provided as <schema_name>.<table_name>, for example public.customer.
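The constraints on these three properties can be summarised in a small Python sketch. The helper `postgres_listening_properties` is hypothetical and only checks the rules stated above. It does not talk to PostgreSQL or to the extension, and the slot names used in the example are made up.

```python
SUPPORTED_PLUGINS = {"decoderbufs", "pgoutput", "wal2json"}


def postgres_listening_properties(table_name, slot_name="debezium",
                                  plugin_name="decoderbufs"):
    """Validate and collect the PostgreSQL listening-mode properties."""
    schema, separator, table = table_name.partition(".")
    if not (separator and schema and table):
        raise ValueError("table.name must be <schema_name>.<table_name>")
    if plugin_name not in SUPPORTED_PLUGINS:
        raise ValueError(f"unsupported plugin.name: {plugin_name}")
    return {
        "slot.name": slot_name,
        "plugin.name": plugin_name,
        "table.name": table_name,
    }


# Two connections need two different slots, so each gets its own slot.name.
customers = postgres_listening_properties("public.customer", slot_name="cdc_customer")
orders = postgres_listening_properties("public.orders", slot_name="cdc_orders")
assert customers["slot.name"] != orders["slot.name"]
```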
See parameter: mode for supported databases and change events.
The JDBC connector jar should be added to the runtime. Download the JDBC connector jar based on the database type that is being used.
For MySQL, use connector version 5.1.xx.
In addition, some prerequisites need to be met depending on the CDC mode used. They are listed below.
Default mode (Listening mode):
Currently MySQL, PostgreSQL and SQLServer are supported in Listening Mode. To capture the change events, databases have to be configured as shown below.
- MySQL - https://debezium.io/documentation/reference/connectors/mysql.html#setup-the-mysql-server
- PostgreSQL - https://debezium.io/docs/connectors/postgresql/#setting-up-PostgreSQL
- SQLServer - https://debezium.io/docs/connectors/sqlserver/#setting-up-sqlserver
Polling mode:
- The change data capturing table should have a polling column. An auto-increment column or a timestamp column can be used.
Please see the API docs for more details about change data capturing modes.
For installing this extension on various Siddhi execution environments, refer to the Siddhi documentation section on adding extensions.
Running integration tests in Docker containers (Optional)
The CDC functionality is tested with the Docker-based integration test framework. The test framework initializes a Docker container with the required configuration before executing the test suite.
Start integration tests
Install and run docker
To run the integration tests, navigate to the siddhi-io-cdc/ directory and issue the following commands.
mvn clean install
mvn verify -P local-mysql -Dskip.surefire.test=true
mvn verify -P local-postgres -Dskip.surefire.test=true
mvn verify -P local-mssql -Dskip.surefire.test=true
mvn verify -P local-oracle -Dskip.surefire.test=true
Support and Contribution
We encourage users to ask questions and get support via StackOverflow. Make sure to add the siddhi tag to the question for a better response.
If you find any issues related to the extension, please report them on the issue tracker.
For production support and other contribution-related information, refer to the Siddhi Community documentation.