sync-diff-inspector: auto position tidb_snapshot from last syncpoint #663

Closed
morgo opened this issue Jul 20, 2022 · 0 comments · Fixed by #664

morgo commented Jul 20, 2022

Feature Request

Is your feature request related to a problem? Please describe:

Currently sync-diff-inspector does not support checking streams that are actively being read from or written to. This is a problem because it means the tool can't be used for TiCDC consistency checking. There is a feature request for online checksumming here: pingcap/dm#1097

However, if the source and destination are both TiDB, online checksumming is not the best approach. Instead, syncpoint (experimental) can be used: the source and destination can each be pinned to a tidb_snapshot so that the comparison is consistent.
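For context, tidb_snapshot accepts a TSO, and the values stored in the syncpoint table are TSOs. The sketch below decodes one into a wall-clock time, which can help judge how far behind "now" a comparison would run. It assumes the usual TSO layout (physical milliseconds in the high bits, an 18-bit logical counter in the low bits); `tso_to_datetime` is a hypothetical helper name, not part of sync-diff-inspector.

```python
from datetime import datetime, timezone

# Assumption: the low 18 bits of a TSO are a logical counter and the
# remaining high bits are milliseconds since the Unix epoch.
LOGICAL_BITS = 18


def tso_to_datetime(tso):
    """Convert a TSO to a timezone-aware UTC datetime (hypothetical helper)."""
    physical_ms = tso >> LOGICAL_BITS
    return datetime.fromtimestamp(physical_ms / 1000, tz=timezone.utc)


# Sample primary_ts taken from the query output shown in this issue.
print(tso_to_datetime(434722879294406656).isoformat())
```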

Describe the feature you'd like:

My feature request is to "auto position" the snapshot used for the comparison by reading the most recent entry from the syncpoint table.

For example, given the following config:

######################### Databases config #########################
[data-sources]
[data-sources.tidb1]
    host = "127.0.0.1"
    port = 4000
    user = "root"
    password = ""
    snapshot = "auto"

[data-sources.tidb2]
    host = "192.168.86.38"
    port = 4000
    user = "root"
    password = ""
    snapshot = "auto"

######################### Task config #########################
# Required
[task]
    # 1 fix sql: fix-target-TIDB1.sql
    # 2 log: sync-diff.log
    # 3 summary: summary.txt
    # 4 checkpoint: a dir
    output-dir = "/tmp/output/config"

    source-instances = ["tidb1"]

    target-instance = "tidb2"

    # tables need to check. *Include `schema` and `table`. Use `.` to split*
    target-check-tables = ["test.stock"]

    # extra table config
    target-configs= ["config1"]

sync-diff-inspector will automatically run the following query on target-instance, then use primary_ts as the snapshot for the source and secondary_ts as the snapshot for the target.

tidb> select primary_ts, secondary_ts  from tidb_cdc.syncpoint_v1 order by primary_ts desc limit 1;
+--------------------+--------------------+
| primary_ts         | secondary_ts       |
+--------------------+--------------------+
| 434722879294406656 | 434723226951352321 |
+--------------------+--------------------+
1 row in set (0.00 sec)

This will save a lot of config file editing.
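The requested behaviour could be sketched roughly as follows. This is only an illustration of the idea, not how sync-diff-inspector implements it: `latest_syncpoint`, `snapshot_statements` and `demo_connection` are hypothetical names, and the demo uses an in-memory SQLite database with a simplified two-column table as a stand-in for the target TiDB instance.

```python
import sqlite3

# Same query as in the issue text, run against the target instance.
SYNCPOINT_QUERY = (
    "SELECT primary_ts, secondary_ts "
    "FROM tidb_cdc.syncpoint_v1 "
    "ORDER BY primary_ts DESC LIMIT 1"
)


def latest_syncpoint(conn):
    """Return (primary_ts, secondary_ts) from the newest syncpoint row.

    `conn` is any DB-API 2.0 connection to the target instance.
    """
    cur = conn.cursor()
    try:
        cur.execute(SYNCPOINT_QUERY)
        row = cur.fetchone()
    finally:
        cur.close()
    if row is None:
        raise LookupError("tidb_cdc.syncpoint_v1 is empty; cannot auto position")
    return int(row[0]), int(row[1])


def snapshot_statements(primary_ts, secondary_ts):
    """Build the per-session statements: primary_ts for source, secondary_ts for target."""
    return {
        "source": f"SET @@tidb_snapshot = '{int(primary_ts)}'",
        "target": f"SET @@tidb_snapshot = '{int(secondary_ts)}'",
    }


def demo_connection():
    """Stand-in for the target instance (assumption: simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS tidb_cdc")
    conn.execute(
        "CREATE TABLE tidb_cdc.syncpoint_v1 (primary_ts INTEGER, secondary_ts INTEGER)"
    )
    conn.executemany(
        "INSERT INTO tidb_cdc.syncpoint_v1 VALUES (?, ?)",
        [
            (434722800000000000, 434723100000000000),  # older, made-up row
            (434722879294406656, 434723226951352321),  # sample row from this issue
        ],
    )
    return conn


primary_ts, secondary_ts = latest_syncpoint(demo_connection())
for role, stmt in snapshot_statements(primary_ts, secondary_ts).items():
    print(role, "->", stmt)
```

Each statement would be executed on every session opened to the corresponding instance before any comparison query runs, so that both sides read from the paired snapshots.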

Describe alternatives you've considered:

The alternative is to generate a new config file for each comparison. But this is a pretty straightforward use case, so I'm hoping support can be built in.

Teachability, Documentation, Adoption, Migration Strategy:

@morgo morgo changed the title sync-diff-inspector: auto position tidb_snapshot from last syncer point sync-diff-inspector: auto position tidb_snapshot from last syncpoint Jul 20, 2022
@morgo morgo self-assigned this Jul 20, 2022