SynchDB is a PostgreSQL extension designed for fast and reliable data replication from multiple heterogeneous databases (including MySQL, SQL Server, and Oracle) into PostgreSQL. It eliminates the need for middleware or third-party software, enabling direct synchronization without additional orchestration.
SynchDB is a dual-language module, combining Java (for utilizing Debezium Embedded connectors) and C (for interacting with PostgreSQL core) components, and requires the Java Virtual Machine (JVM) and Java Native Interface (JNI) to operate together.
Visit SynchDB documentation site here for more design details.
SynchDB extension consists of six major components:
- Debezium Runner Engine (Java)
- SynchDB Launcher
- SynchDB Worker
- Format Converter
- Replication Agent
- Table Synch Agent (TBD)
- A java application that drives Debezium Embedded engine and utilizes all of the database connectors that it supports.
- Invoked by
SynchDB Worker
to initialize Debezium embedded library and receive change data. - Send the change data to
SynchDB Worker
in generalized JSON format for further processing.
- Responsible for creating and destroying SynchDB workers using PostgreSQL's background worker APIs.
- Configure each worker's connector type, destination database IPs, ports, etc.
- Instantiats a
Debezium Runner Engine
inside a Java Virtual Machine (JVM) to replicate changes from a specific connector type. - Communicate with Debezium Runner via JNI to receive change data in JSON formats.
- Transfer the JSON change data to
Format Converter
module for further processing.
- Parse the JSON change data using PostgreSQL Jsonb APIs
- Transform DDL queries to PostgreSQL compatible queries and translate heterogeneous column data type to PostgreSQL compatible types.
- Transform DML queries to PostgreSQL compatible queries and handle data processing based on column data types
- Produces raw HeapTupleData which can be fed directly to Heap Access Method within PostgreSQL for faster executions.
- Processes the outputs from
Format Converter
. Format Converter
will produce HeapTupleData format outputs, thenReplication Agent
will invoke PostgreSQL's heap access method routines to handle them.
- Design details and implementation are not available yet. TBD
- Intended to provide a more efficient alternative to perform initial table synchronization.
The following software is required to build and run SynchDB. The versions listed are the versions tested during development. Older versions may still work.
- Java Development Kit 17 or later. Download here
- Apache Maven 3.6.3 or later. Download here
- PostgreSQL 16.3 Source. Git clone here. Refer to this wiki for PostgreSQL build requirements
- Docker compose 2.28.1 (for testing). Refer to here
- Unix based operating system like Ubuntu 22.04 or MacOS
Clone the PostgreSQL source and switch to 16.3 release tag
git clone https://github.com/postgres/postgres.git --branch REL_16_3
cd postgres
Clone the SynchDB source from within the extension folder
cd contrib/
git clone https://github.com/Hornetlabs/synchdb.git
If you are working on Ubuntu 22.04.4 LTS, install the Maven as below:
sudo apt install maven
if you are using MacOS, you can use the brew command to install maven (refer (here)[https://brew.sh/] for how to install Homebrew) without any other settings:
brew install maven
If you are working on Ubuntu 22.04.4 LTS, install the OpenJDK as below:
sudo apt install openjdk-21-jdk
If you are working on MacOS, please install the JDK with brew command:
brew install openjdk@22
This can be done by following the standard build and install procedure as described here. Generally, it consists of:
cd /home/$USER/postgres
./configure
make
sudo make install
You should build and install the default extensions as well:
cd /home/$USER/postgres/contrib
make
sudo make install
The commands below build and install the Debezium Runner Engine jar file to your PostgreSQL's lib folder.
cd /home/$USER/postgres/contrib/synchdb
make build_dbz
sudo make install_dbz
The commands below build and install SynchDB extension to your PostgreSQL's lib and share folder.
cd /home/$USER/postgres/contrib/synchdb
make
sudo make install
Lastly, we also need to tell your system's linker where the newly added Java library (libjvm.so) is located in your system.
# Dynamically set JDK paths
JAVA_PATH=$(which java)
JDK_HOME_PATH=$(readlink -f ${JAVA_PATH} | sed 's:/bin/java::')
JDK_LIB_PATH=${JDK_HOME_PATH}/lib
echo $JDK_LIB_PATH
echo $JDK_LIB_PATH/server
sudo echo "$JDK_LIB_PATH" | sudo tee -a /etc/ld.so.conf.d/x86_64-linux-gnu.conf
sudo echo "$JDK_LIB_PATH/server" | sudo tee -a /etc/ld.so.conf.d/x86_64-linux-gnu.conf
Note, for mac with M1/M2 chips, you need to the two lines into /etc/ld.so.conf.d/aarch64-linux-gnu.conf
Run ldconfig to reload:
sudo ldconfig
Ensure synchdo.so extension can link to libjvm Java library on your system:
ldd synchdb.so
linux-vdso.so.1 (0x00007ffeae35a000)
libjvm.so => /usr/lib/jdk-22.0.1/lib/server/libjvm.so (0x00007fc1276c1000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc127498000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fc127493000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fc12748e000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fc127489000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fc1273a0000)
/lib64/ld-linux-x86-64.so.2 (0x00007fc128b81000)
Refer to testenv/README.md
or visit our documentation page here to prepare a sample MySQL, SQL Server or Oracle database instances for testing.
SynchDB extension requires pgcrypto to encrypt certain sensitive credential data. Please make sure it is installed prior to installing SynchDB. Alternatively, you can include CASCADE
clause in CREATE EXTENSION
to automatically install dependencies:
CREATE EXTENSION synchdb CASCADE;
A connector represents the details to connecto to a remote heterogeneous database and describes what tables to replicate from. It can be created with synchdb_add_conninfo()
function.
Create a MySQL connector and replicate inventory.orders
and inventory.customers
tables under invnetory
database:
SELECT synchdb_add_conninfo('mysqlconn','127.0.0.1', 3306, 'mysqluser', 'mysqlpwd', 'inventory', 'postgres', 'inventory.orders,inventory.customers', 'mysql');
Create a SQL Server connector and replicate from all tables under testDB
database.
SELECT synchdb_add_conninfo('sqlserverconn','127.0.0.1', 1433, 'sa', 'Password!', 'testDB', 'postgres', '', 'sqlserver');
Create a Oracle connector and replicate from all tables under mydb
database.
SELECT synchdb_add_conninfo('oracleconn','127.0.0.1', 1521, 'c##dbzuser', 'dbz', 'mydb', 'postgres', '', 'oracle');
All created connectors are stored in the synchdb_conninfo
table. Please note that user passwords are encrypted by SynchDB on creation so please do not modify the password field directly.
postgres=# \x
Expanded display is on.
postgres=# select * from synchdb_conninfo;
-[ RECORD 1 ]-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
name | oracleconn
isactive | f
data | {"pwd": "\\xc30d04070302e3baf1293d0d553066d234014f6fc52e6eea425884b1f65f1955bf504b85062dfe538ca2e22bfd6db9916662406fc45a3a530b7bf43ce4cfaa2b049a1c9af8", "port": 1521, "user": "c##dbzuser", "dstdb": "postgres", "srcdb": "mydb", "table": "null", "hostname": "127.0.0.1", "connector": "oracle"}
-[ RECORD 2 ]-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
name | sqlserverconn
isactive | t
data | {"pwd": "\\xc30d0407030245ca4a983b6304c079d23a0191c6dabc1683e4f66fc538db65b9ab2788257762438961f8201e6bcefafa60460fbf441e55d844e7f27b31745f04e7251c0123a159540676c4", "port": 1433, "user": "sa", "dstdb": "postgres", "srcdb": "testDB", "table": "null", "hostname": "127.0.0.1", "connector": "sqlserver"}
-[ RECORD 3 ]-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
name | mysqlconn
isactive | t
data | {"pwd": "\\xc30d04070302986aff858065e96b62d23901b418a1f0bfdf874ea9143ec096cd648a1588090ee840de58fb6ba5a04c6430d8fe7f7d466b70a930597d48b8d31e736e77032cb34c86354e", "port": 3306, "user": "mysqluser", "dstdb": "postgres", "srcdb": "inventory", "table": "inventory.orders,inventory.customers", "hostname": "127.0.0.1", "connector": "mysql"}
A connector can be started by synchdb_start_engine_bgw()
function. It takes one connector name argument which must have been created by synchdb_add_conninfo()
prior to starting. The commands below starts a connector in default snapshot mode, which replicates designated table schemas, their initial data and proceed to stream live changes (CDC).
select synchdb_start_engine_bgw('mysqlconn');
select synchdb_start_engine_bgw('oracleconn');
select synchdb_start_engine_bgw('sqlserverconn');
Use synchdb_state_view()
to examine all connectors' running states.
postgres=# select * from synchdb_state_view;
name | connector_type | pid | stage | state | err | last_dbz_offset
---------------+----------------+--------+---------------------+---------+----------+------------------------------------------------------------------------------------------------------
sqlserverconn | sqlserver | 579820 | change data capture | polling | no error | {"commit_lsn":"0000006a:00006608:0003","snapshot":true,"snapshot_completed":false}
mysqlconn | mysql | 579845 | change data capture | polling | no error | {"ts_sec":1741301103,"file":"mysql-bin.000009","pos":574318212,"row":1,"server_id":223344,"event":2}
oracleconn | oracle | 580053 | change data capture | polling | no error | offset file not flushed yet
(3 rows)
Use synchdb_stop_engine_bgw()
SQL function to stop a connector.It takes one connector name argument which must have been created by synchdb_add_conninfo()
function.
select synchdb_stop_engine_bgw('mysqlconn');
select synchdb_stop_engine_bgw('sqlserverconn');
select synchdb_stop_engine_bgw('oracleconn');