SynchDB 1.0

@caryhuang caryhuang released this 24 Dec 17:18
d79fde6

Release Date: December 24, 2024

We're thrilled to announce the release of SynchDB 1.0! This PostgreSQL extension is designed to seamlessly synchronize data from multiple heterogeneous databases, such as MySQL and MS SQL Server, directly to PostgreSQL. SynchDB manages all data synchronization without requiring middleware, making it an efficient, streamlined solution for real-time data replication and integration.

This release primarily resolves the performance and resource issues known in the 1.0 beta1 release and adds several new utilities that allow users to fine-tune the behavior and performance of SynchDB.

Added

  • added a data cache in the DML parsing stage to avoid frequent access to PostgreSQL's catalog when obtaining a table's tuple descriptor structure.
  • added a variant of synchdb_start_engine_bgw(name, mode) that takes a second argument to indicate a custom snapshot mode to start the connector with. More detail here.
  • added several new GUCs that can be adjusted to tune the performance of Debezium Runner. Refer to here for the complete list.
  • added a debug SQL function synchdb_log_jvm_meminfo(name) that causes the specified connector to write a summary of its current JVM heap memory usage to the PostgreSQL log file.
  • added a new VIEW synchdb_stats_view that shows statistics for all connectors. More detail here.
  • added a new SQL function synchdb_reset_stats(name) to clear the statistics of the specified connector. More detail here.
  • added a mess creation script to quickly generate test tables and data on a MySQL database.
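
As a rough illustration of how the new functions and view listed above might be used from psql, here is a minimal sketch. The connector name 'mysqlconn' is hypothetical, and 'initial' is only an assumed snapshot mode value; consult the linked documentation for the modes SynchDB actually accepts.

```sql
-- 'mysqlconn' is a hypothetical connector name used for illustration only.
-- 'initial' is an assumed snapshot mode value; check the docs for supported modes.
SELECT synchdb_start_engine_bgw('mysqlconn', 'initial');

-- Ask the connector to write its JVM heap usage summary to the PostgreSQL log.
SELECT synchdb_log_jvm_meminfo('mysqlconn');

-- Inspect statistics for all connectors.
SELECT * FROM synchdb_stats_view;

-- Clear the statistics of one connector.
SELECT synchdb_reset_stats('mysqlconn');
```

The output of synchdb_log_jvm_meminfo goes to the server log rather than the client session, so check the PostgreSQL log file after calling it.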

Changed

  • synchdb_state_view(): added a new field called stage that indicates the current stage of a connector (value can either be Initial Snapshot or Change Data Capture).
  • synchdb_state_view(): will only show state of valid connectors.
  • removed sending "partial batch completion" notification to Debezium Runner in case of error, because a batch is now handled by one PostgreSQL transaction, and partial completion is not allowed.
  • SSL related parameters per connector can now be specified in the rule file.
  • the maximum heap memory to allocate to the JVM that runs the Debezium Runner can now be configured via GUC.
  • the maximum number of connector background workers is now configurable instead of being hardcoded to 30.
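
To check the new stage field, a query along these lines may be enough. This is a sketch that follows the call-style notation used in these notes; if synchdb_state_view is exposed as a plain view in your installation, drop the parentheses.

```sql
-- Lists the state of all valid connectors, including the new "stage" field.
-- Written with the parenthesized form used in the notes above (an assumption
-- about how it is exposed); use "FROM synchdb_state_view" if it is a view.
SELECT * FROM synchdb_state_view();
```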

Fixed

  • fixed rapid memory buildup in the Debezium Runner JVM by adding throttle control to the receiving of change events.
  • resolved the majority of memory leaks in both the SynchDB and Debezium Runner components.
  • corrected the use of memory context in SynchDB such that heap memory can be correctly freed at the end of each change event processing.
  • significantly increased the processing speed of SynchDB by processing a batch within a single PostgreSQL transaction rather than multiple.
  • corrected SQL Server's default data type size mapping for the char type from 0 to -1.
  • resolved high memory usage during DML processing using SPI.

Known Issues

  • Automatic connector launcher only launches connector workers created under the default postgres database (#71)
  • ALTER TABLE ALTER COLUMN does not handle: (#77)
    • Complex data type changes (e.g., from TEXT -> INT)
    • Column index changes
    • Renamed columns
  • Restarting a paused connector causes it to resume rather than start in the paused state (#80)