Releases: Hornetlabs/synchdb

SynchDB 1.0

24 Dec 17:18
d79fde6

Release Date: December 24, 2024

We're thrilled to announce the release of SynchDB 1.0! This PostgreSQL extension is designed to seamlessly synchronize data from multiple heterogeneous databases, such as MySQL and MS SQL Server, directly to PostgreSQL. SynchDB manages all data synchronization without requiring middleware, making it an efficient, streamlined solution for real-time data replication and integration.

This release primarily resolves the performance and resource issues known in the 1.0 Beta1 release and adds several new utilities that let users fine-tune the behavior and performance of SynchDB.

Added

  • added a data cache in the DML parsing stage to avoid frequent access to PostgreSQL's catalog when obtaining a table's tuple descriptor structure.
  • added a variant of synchdb_start_engine_bgw(name, mode) that takes a second argument to indicate a custom snapshot mode to start the connector with. More detail here.
  • added several new GUCs that can be adjusted to tune the performance of Debezium Runner. Refer to here for the complete list.
  • added a debug SQL function synchdb_log_jvm_meminfo(name) that causes the specified connector to write a summary of its current JVM heap memory usage to the PostgreSQL log file.
  • added a new VIEW synchdb_stats_view that shows statistics for all connectors. More detail here.
  • added a new SQL function synchdb_reset_stats(name) to clear the statistics of the specified connector. More detail here.
  • added a mess creation script to quickly generate test tables and data on MySQL database type.

Changed

  • synchdb_state_view(): added a new field called stage that indicates the current stage of a connector (value can either be Initial Snapshot or Change Data Capture).
  • synchdb_state_view(): will only show state of valid connectors.
  • removed sending "partial batch completion" notification to Debezium Runner in case of error, because a batch is now handled by one PostgreSQL transaction, and partial completion is not allowed.
  • SSL related parameters per connector can now be specified in the rule file.
  • the maximum heap memory to allocate to the JVM that runs the Debezium Runner can now be configured via GUC.
  • the maximum number of connector background workers is now configurable instead of being hardcoded to 30.
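The all-or-nothing batch behavior described above (one transaction per batch, no partial completion) can be illustrated with a minimal sketch. This is not SynchDB code: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the `target` table and `apply_batch` helper are hypothetical names chosen for illustration only.

```python
import sqlite3


def apply_batch(conn, events):
    """Apply every (id, value) event in one transaction.

    Returns True if the whole batch committed, False if any event
    failed and the whole batch was rolled back.
    """
    try:
        with conn:  # commits on success, rolls back on any exception
            for row_id, value in events:
                conn.execute(
                    "INSERT INTO target (id, value) VALUES (?, ?)",
                    (row_id, value),
                )
        return True
    except sqlite3.Error:
        return False


# Hypothetical target table, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, value TEXT NOT NULL)")

ok = apply_batch(conn, [(1, "a"), (2, "b")])
# The second event reuses primary key 1, so the whole batch is rolled back,
# including the otherwise valid row with id 3.
failed = apply_batch(conn, [(3, "c"), (1, "duplicate")])
row_count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
```

Because a failed batch leaves no rows behind, there is nothing partial to report back to the event source, which is consistent with dropping the "partial batch completion" notification.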

Fixed

  • fixed rapid memory buildup in the Debezium Runner's JVM by adding throttle control when receiving change events.
  • resolved the majority of memory leaks in both the SynchDB and Debezium Runner components.
  • corrected the use of memory context in SynchDB such that heap memory can be correctly freed at the end of each change event processing.
  • significantly increased the processing speed of SynchDB by processing a batch within a single PostgreSQL transaction rather than multiple.
  • corrected SQLServer's default data type size mapping for char type from 0 to -1.
  • resolved high memory usage during DML processing using SPI.
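The throttle control mentioned above can be pictured as backpressure between a fast producer of change events and a slower consumer. The sketch below is only an illustration of that general idea using a bounded queue from Python's standard library. It makes no claim about how the Debezium Runner implements its throttle, and `run_pipeline` and `max_pending` are hypothetical names.

```python
import queue
import threading


def run_pipeline(total_events, max_pending):
    """Push events through a bounded queue and report the largest backlog seen."""
    pending = queue.Queue(maxsize=max_pending)  # put() blocks while the queue is full
    processed = []
    peak = 0

    def consumer():
        while True:
            event = pending.get()
            if event is None:  # sentinel: the producer is done
                break
            processed.append(event)

    worker = threading.Thread(target=consumer)
    worker.start()
    for event in range(total_events):
        pending.put(event)  # the producer is held to the consumer's pace
        peak = max(peak, pending.qsize())
    pending.put(None)
    worker.join()
    return processed, peak


processed, peak = run_pipeline(total_events=1000, max_pending=16)
```

However many events the source produces, the backlog held in memory never exceeds `max_pending`, which is the property that keeps memory use bounded.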

Known Issues

  • Automatic connector launcher only launches connector workers created under the default postgres database (#71)
  • ALTER TABLE ALTER COLUMN does not handle: (#77)
    • Complex data type changes (e.g., from TEXT -> INT)
    • Column index changes
    • Renamed columns
  • Restarting a paused connector causes it to resume instead of starting in the paused state (#80)

SynchDB 1.0 Beta1

25 Oct 22:08
8a9af39
Pre-release

Release Date: October 23, 2024

We're thrilled to announce the beta release of SynchDB 1.0! This PostgreSQL extension is designed to seamlessly synchronize data from multiple heterogeneous databases, such as MySQL and MS SQL Server, directly to PostgreSQL. SynchDB manages all data synchronization without requiring middleware, making it an efficient, streamlined solution for real-time data replication and integration.

Requirements

  • PostgreSQL 16
  • Java Runtime Environment 17 or later

Core Features

  • Supports logical replication from heterogeneous databases (MySQL and SQLServer)
  • Supports DDL replication
    • CREATE TABLE
    • DROP TABLE
    • ALTER TABLE ADD COLUMN
    • ALTER TABLE DROP COLUMN
    • ALTER TABLE ALTER COLUMN
  • Supports DML replication (INSERT, UPDATE, DELETE)
  • Supports a maximum of 30 concurrent connector workers
  • Supports automatic connector launcher at PostgreSQL startup
  • Supports global connector state and last error message views
  • Supports selective replication of databases and tables
  • Supports change events in batches
  • Supports connector restarts in different snapshot modes
  • Supports offset management interfaces to select custom replication resume point
  • Supports default data type and object name transform rules for supported heterogeneous databases
  • Supports JSON rule file to define custom:
    • Data type transform rules
    • Column name transform rules
    • Table name transform rules
    • Data expression transform rules
  • Supports 2 data apply modes (SPI, HeapAM API)
  • Supports several utility functions to perform connector operations:
    • start
    • stop
    • pause
    • resume

Known Issues

  • Automatic connector launcher only launches connector workers created under the default postgres database (#71)
  • ALTER TABLE ALTER COLUMN does not handle: (#77)
    • Complex data type changes (e.g., from TEXT -> INT)
    • Column index changes
    • Renamed columns
  • Cannot specify X509 certificate and private key to connect to heterogeneous databases via TLS (#78)
  • The last_dbz_offset column from synchdb_state_view() does not reflect the current data offset, but rather the last offset value flushed to disk (#79)
  • Restarting a paused connector causes it to resume instead of starting in the paused state (#80)
  • Memory leak issues in SynchDB
  • Java heap memory in the JVM may run out if the source tables are too large; throttle control is needed