This repository has been archived by the owner on Nov 24, 2023. It is now read-only.

online replication checksum #1097

Open
csuzhangxc opened this issue Sep 24, 2020 · 16 comments
Labels
help wanted (This issue wanted some help from contributor) · type/feature-request (This issue is a feature request)

Comments

@csuzhangxc
Member

Feature Request

Is your feature request related to a problem? Please describe:

now, we use sync-diff-inspector to check data between the upstream MySQL and the downstream TiDB, but it can only check data that will not be updated during the checking process (this may be whole MySQL instances, a database, a table, or part of the data in a table with a range specified in the sync-diff-inspector config)

another similar issue is #688.

Describe the feature you'd like:

Provide a complete online replication checksum feature.

NOTE: it would be better if no extra data needs to be written.

Describe alternatives you've considered:

  • Data checksum: source and target database data comparison
  • Log checksum: calculate the checksum of the log to ensure that each log entry is applied, regardless of the application result; suitable for log backup and distribution
  • Implement the data checksum and the log checksum separately
    • implement the data checksum in the data validation program
    • implement the log checksum in the log replication program
  • data checksum (see the query sketch after this list)
    • Divide chunks for the checksum
    • Control the speed of the checksum
    • do the checksum
      • write helper data in the source database
      • lock the chunk in the source database, or run the checksum multiple times to avoid locking chunks
      • cooperate with the data replication program to achieve an incremental data checksum
    • The data checksum program can run independently, similar to sync-diff-inspector
    • Define an interface to adapt to multiple source databases
  • log checksum
    • Divide chunks according to the log position
    • calculate the raw log checksum in a separate process
      • MySQL binlog
        • binlog event checksum
        • binlog event count
      • TiCDC/TiKV change log
        • how to do it?
  • additional features
    • specify the maximum number of records that can fail validation before validation is suspended for the task
    • specify the delay before reporting any validation failure details
    • show suspended records - some records in the table can't be validated, for example, if a record at the source is constantly being updated, we can't compare the source and the target
    • incremental data validation - identify ongoing change logs and validate the change logs row by row on the source and target databases
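
A minimal sketch of what the chunked data checksum could look like (hypothetical table db1.t with columns id, a, b; the BIT_XOR(CRC32(...))-style aggregate is the one used by tools like pt-table-checksum, not necessarily what we would end up with):

```go
// A minimal sketch of a per-chunk checksum (hypothetical table db1.t with an
// integer primary key `id` and columns `a`, `b`; not DM's actual code).
// The same query runs on the upstream MySQL and the downstream TiDB, and the
// two results are compared chunk by chunk.
package checksum

import (
	"context"
	"database/sql"
)

// chunkChecksum returns an order-insensitive checksum and row count for rows
// with lo <= id < hi. NULL handling and chunk splitting are simplified here.
func chunkChecksum(ctx context.Context, db *sql.DB, lo, hi int64) (uint64, int64, error) {
	const q = `SELECT COALESCE(BIT_XOR(CAST(CRC32(CONCAT_WS(',', id, a, b)) AS UNSIGNED)), 0), COUNT(*)
	           FROM db1.t WHERE id >= ? AND id < ?`
	var sum uint64
	var cnt int64
	err := db.QueryRowContext(ctx, q, lo, hi).Scan(&sum, &cnt)
	return sum, cnt, err
}

// compareChunk runs the checksum on both sides; throttling (the "control the
// speed" point above) would go between chunks.
func compareChunk(ctx context.Context, src, dst *sql.DB, lo, hi int64) (bool, error) {
	s1, c1, err := chunkChecksum(ctx, src, lo, hi)
	if err != nil {
		return false, err
	}
	s2, c2, err := chunkChecksum(ctx, dst, lo, hi)
	if err != nil {
		return false, err
	}
	return s1 == s2 && c1 == c2, nil
}
```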

Teachability, Documentation, Adoption, Migration Strategy:

csuzhangxc added the type/feature-request and help wanted labels on Sep 24, 2020
@tirsen

tirsen commented Sep 29, 2020

I think the best way to implement this is to use FLUSH TABLES WITH READ LOCK on the source mysql database and then stop replication at the correct GTID in the DM binlog player.

It's a bit fiddly but you can use a process that goes something like this:

  1. Stop the mysql->tidb binlog player
  2. Run FLUSH TABLES WITH READ LOCK on the source mysql and make a note of the GTID
  3. Create N connections (each worker thread needs its own connection) and start a transaction in each
  4. UNLOCK TABLES
  5. Restart the mysql->tidb binlog player and let it run until it reaches the GTID from step 2 above, then make a note of the TiDB timestamp that this GTID resulted in
  6. Restart the mysql->tidb binlog player
  7. You can now run N parallel checksumming workers each using one of the connections created in 3 to diff against a TiDB connection synced to the TiDB timestamp from 5

This is what Vitess does to implement checksumming during a shard split.
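
To make steps 2-4 concrete, here is a rough sketch in Go against the source MySQL (openSnapshotConns is a hypothetical helper, not DM's actual API; error handling and cleanup are simplified):

```go
// Illustrative sketch of steps 2-4 above: take FTWRL on the source, record the
// executed GTID set, open N connections pinned to a consistent snapshot, then
// release the lock.
package checksum

import (
	"context"
	"database/sql"
)

func openSnapshotConns(ctx context.Context, src *sql.DB, n int) (gtid string, conns []*sql.Conn, err error) {
	lockConn, err := src.Conn(ctx)
	if err != nil {
		return "", nil, err
	}
	defer lockConn.Close() // error handling/cleanup simplified

	// Step 2: block writes and note the GTID position.
	if _, err = lockConn.ExecContext(ctx, "FLUSH TABLES WITH READ LOCK"); err != nil {
		return "", nil, err
	}
	if err = lockConn.QueryRowContext(ctx, "SELECT @@GLOBAL.gtid_executed").Scan(&gtid); err != nil {
		return "", nil, err
	}

	// Step 3: each checksum worker gets its own connection, with a consistent
	// snapshot taken while the lock is still held.
	for i := 0; i < n; i++ {
		c, cerr := src.Conn(ctx)
		if cerr != nil {
			return "", nil, cerr
		}
		if _, cerr = c.ExecContext(ctx, "START TRANSACTION WITH CONSISTENT SNAPSHOT"); cerr != nil {
			return "", nil, cerr
		}
		conns = append(conns, c)
	}

	// Step 4: writes may resume; the snapshots above stay fixed at the recorded GTID.
	_, err = lockConn.ExecContext(ctx, "UNLOCK TABLES")
	return gtid, conns, err
}
```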

@csuzhangxc
Member Author

@tirsen 👍 we are planning to implement "stop replication at the correct GTID" (#348 is in our roadmap)

@csuzhangxc
Member Author

@tirsen we think your proposal above is really good! We only have a few details to append:

  • may need to use FLUSH TABLES tbl_name [, tbl_name] ... WITH READ LOCK instead of FLUSH TABLES WITH READ LOCK to lock only the tables we are migrating, or even flush & lock only part of them in each checking round, to reduce the load on MySQL
  • may need to consider the GC life time of TiDB and abort the check after some timeout
  • when merging shard tables from multiple MySQL instances into one table in TiDB, we may need to stop the binlog player for the first MySQL instance when its GTID is reached and not restart it until the binlog player for the last MySQL instance has reached its GTID, so that the data in TiDB matches all shards in MySQL
  • may need to abort the check if there is a large replication lag between TiDB and MySQL
  • may need to abort the check if acquiring FTWRL takes too long
  • may still need to support checking only part of the data in a table, based on update_time or something similar
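
A small sketch for the first two points above (illustrative, not DM's code): on the source MySQL, only the migrated tables would be locked, and on TiDB we can abort early if the GC life time is shorter than the expected check duration.

```go
// On the source MySQL, lock only the migrated tables, e.g.
//   FLUSH TABLES db1.tbl1, db1.tbl2 WITH READ LOCK;
// On TiDB, abort early if the GC life time is shorter than the time the check
// is expected to take, since the historical snapshot would otherwise be GC-ed.
package checksum

import (
	"context"
	"database/sql"
	"fmt"
	"time"
)

func checkGCLifeTime(ctx context.Context, tidb *sql.DB, expected time.Duration) error {
	// On TiDB versions that keep this setting in mysql.tidb.
	var v string
	err := tidb.QueryRowContext(ctx,
		"SELECT VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME = 'tikv_gc_life_time'").Scan(&v)
	if err != nil {
		return err
	}
	life, err := time.ParseDuration(v) // values look like "10m0s"
	if err != nil {
		return err
	}
	if life < expected {
		return fmt.Errorf("TiDB GC life time %s is shorter than the expected check duration %s", life, expected)
	}
	return nil
}
```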

@tirsen

tirsen commented Sep 30, 2020

@tirsen we think your proposal above is really good! We only have a few details to append:

  • may need to use FLUSH TABLES tbl_name [, tbl_name] ... WITH READ LOCK instead of FLUSH TABLES WITH READ LOCK to lock only the tables we are migrating, or even flush & lock only part of them in each checking round, to reduce the load on MySQL

Ah yes good point.

Btw you will also need to have an option to use LOCK TABLES ... WITH READ LOCK because Aurora doesn't support FTWRL.

  • may need to consider the GC life time of TiDB and abort the check after some timeout

Yeah, ideally we would increase the GC life time if necessary.

  • when merging shard tables from multiple MySQL instances into one table in TiDB, we may need to stop the binlog player for the first MySQL instance when its GTID is reached and not restart it until the binlog player for the last MySQL instance has reached its GTID, so that the data in TiDB matches all shards in MySQL

I think each shard needs to be checksummed separately? With a separate GTID/transaction timestamp, separate checksum process and separate checksumming connection.

  • may need to abort the check if there is a large replication lag between TiDB and MySQL

Yes.

  • may need to abort the check if acquiring FTWRL takes too long

Yes. You need to run this during low traffic so that FTWRL completes for all tables. There will be a brief outage during FTWRL so that's not something you want during peak traffic.

  • may still need to support checking only part of the data in a table, based on update_time or something similar

Not sure what you mean here. :-) You mean that we can add a configurable WHERE updated_at > ? clause to the comparison to decrease the set of rows to checksum? Yeah that's a nice feature.
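
For reference, that would just extend the chunk-checksum query with a configurable lower bound on a hypothetical updated_at column:

```go
// Hypothetical: restrict the checksum to recently-updated rows only; the lower
// bound on updated_at would come from the task configuration.
const filteredChecksumQuery = `
	SELECT COALESCE(BIT_XOR(CAST(CRC32(CONCAT_WS(',', id, a, b)) AS UNSIGNED)), 0), COUNT(*)
	FROM db1.t
	WHERE id >= ? AND id < ? AND updated_at > ?`
```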

@csuzhangxc
Member Author

csuzhangxc commented Sep 30, 2020

I think each shard needs to be checksummed separately? With a separate GTID/transaction timestamp, separate checksum process and separate checksumming connection.

when MySQL-1-GTID-A == TiDB-TSO-A, there are rows in TiDB like:

1, a  # from MySQL-1
2, b  # from MySQL-2
3, c  # from MySQL-1

but when MySQL-2-GTID-B == TiDB-TSO-B, there are rows in TiDB like:

1, a  # from MySQL-1
2, b  # from MySQL-2
3, c  # from MySQL-1
4, d  # from MySQL-1
5, e  # from MySQL-2

it's a bit hard to check data consistency for both MySQL-1-GTID-A and MySQL-2-GTID-B by checksum, because there isn't a clear WHERE clause that can match data from MySQL-1 or MySQL-2 in TiDB.

so we may need to block replication for MySQL-1 at GTID-A until MySQL-2 reaches GTID-B, with rows in TiDB like:

1, a  # from MySQL-1
2, b  # from MySQL-2
3, c  # from MySQL-1
# no 4, d here
5, e  # from MySQL-2

then we can calculate checksums for MySQL-1 and MySQL-2 separately and XOR (or otherwise combine) them to compare against the checksum calculated in TiDB.
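
To make the combination step concrete, a rough sketch (illustrative only) of comparing XOR-ed per-shard checksums with the TiDB-side checksum:

```go
// A minimal sketch of the XOR-combination idea (illustrative, not DM's code).
// It assumes each shard's checksum and the TiDB-side checksum use the same
// order-insensitive BIT_XOR(CRC32(...))-style query over disjoint row sets, so
// XOR-ing the per-shard results is directly comparable to the checksum of the
// merged table read at the corresponding TiDB snapshot.
package checksum

func shardsMatchTiDB(shardSums []uint64, shardCounts []int64, tidbSum uint64, tidbCount int64) bool {
	var combined uint64
	var total int64
	for i, s := range shardSums {
		combined ^= s
		total += shardCounts[i]
	}
	return combined == tidbSum && total == tidbCount
}
```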

@csuzhangxc
Member Author

You mean that we can add a configurable WHERE updated_at > ? clause to the comparison to decrease the set of rows to checksum?

YES

@csuzhangxc
Member Author

Btw you will also need to have an option to use LOCK TABLES ... WITH READ LOCK because Aurora doesn't support FTWRL.

It seems we can't FLUSH TABLES WITH READ LOCK in Aurora (it requires the SUPER privilege), but we can FLUSH TABLES tbl_name [, tbl_name] ... WITH READ LOCK in Aurora (it only needs the RELOAD privilege, not SUPER).

@IANTHEREAL
Collaborator

IANTHEREAL commented Oct 9, 2020

Saw the above discussion on how to do the data checksum: use FTWRL/LTWRL to get consistent snapshots. If we need to migrate data from multiple shard tables to one TiDB table, then multiple consistent snapshots are required.

This scheme is very good, but I have some concerns about FTWRL: it has a relatively large impact on the database, and some DBAs I know dare not use it on the master database. This is just a concern, but if data comparisons are frequent, then we need to pay attention to it.

In addition, my understanding of a real-time/online replication checksum is an incremental checksum, not a full data checksum mechanism; the closest approach may be to choose only some chunks to do the data checksum on. How to choose chunks is a question worth considering.

Based on the above ideas, I propose an optimistic data checksum scheme - assuming that a chunk will not be updated for a period of time (or later), we can verify it multiple times.

  • If the verification is consistent, the verification passes
  • If it is still inconsistent after the allowed number of retries, then an error is reported, the data is inconsistent, and we need to use another data checksum method, such as the scheme discussed above

In this way, we may be able to avoid the need to lock tables and control the DM replication task.
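
A rough sketch of the optimistic check loop (hypothetical names; compareChunk stands for whatever two-sided chunk comparison we use, e.g. the one sketched earlier):

```go
// Optimistic chunk verification sketch (not DM's API): re-check a chunk a few
// times; one consistent result passes the chunk. If it never matches within
// maxRetries, it is reported and a stronger method (e.g. the lock/GTID scheme
// above) is needed.
package checksum

import (
	"context"
	"database/sql"
	"time"
)

func verifyChunkOptimistically(ctx context.Context, src, dst *sql.DB, lo, hi int64,
	maxRetries int, wait time.Duration) (bool, error) {
	for i := 0; i < maxRetries; i++ {
		ok, err := compareChunk(ctx, src, dst, lo, hi)
		if err != nil {
			return false, err
		}
		if ok {
			return true, nil // consistent at least once: verification passes
		}
		select {
		case <-ctx.Done():
			return false, ctx.Err()
		case <-time.After(wait): // give replication time to catch up before re-checking
		}
	}
	return false, nil // still inconsistent: needs the pessimistic (locking) check
}
```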

@csuzhangxc
Member Author

Saw the above discussion on how to do the data checksum: use FTWRL/LTWRL to get consistent snapshots. If we need to migrate data from multiple shard tables to one TiDB table, then multiple consistent snapshots are required.

This scheme is very good, but I have some concerns about FTWRL: it has a relatively large impact on the database, and some DBAs I know dare not use it on the master database. This is just a concern, but if data comparisons are frequent, then we need to pay attention to it.

In addition, my understanding of a real-time/online replication checksum is an incremental checksum, not a full data checksum mechanism; the closest approach may be to choose only some chunks to do the data checksum on. How to choose chunks is a question worth considering.

Based on the above ideas, I propose an optimistic data checksum scheme - assuming that a chunk will not be updated for a period of time (or later), we can verify it multiple times.

  • If the verification is consistent, the verification passes
  • If it is still inconsistent after the allowed number of retries, then an error is reported, the data is inconsistent, and we need to use another data checksum method, such as the scheme discussed above

In this way, we may be able to avoid the need to lock tables and control the DM replication task.

I love this two-level checksum proposal, what's your opinion? @tirsen

@lance6716
Collaborator

If we don't lock tables, the upstream and downstream may have some different data, but not all of the data will be different. So if we split tables into chunks by some good WHERE clause, chunks that contain only cold data could pass verification without locking tables in the "incremental" checksum. And if we switch to the "lock table" checksum and some chunks are still not touched during binlog replication, these chunks could be skipped to shorten the comparison time.
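
A rough sketch of tracking touched chunks during binlog replication so that untouched chunks can be skipped (all names hypothetical, not DM's API):

```go
// The binlog replication path marks the chunk that each replicated row falls
// into; the checksum path skips chunks that were not touched since they last
// passed verification.
package checksum

import "sync"

type chunkTracker struct {
	mu        sync.Mutex
	touched   map[int64]bool // chunk index -> modified since last verification
	chunkSize int64          // rows per chunk; chunks are contiguous primary-key ranges
}

func newChunkTracker(chunkSize int64) *chunkTracker {
	return &chunkTracker{touched: make(map[int64]bool), chunkSize: chunkSize}
}

// MarkRow is called from the replication path for every replicated row change.
func (t *chunkTracker) MarkRow(pk int64) {
	t.mu.Lock()
	t.touched[pk/t.chunkSize] = true
	t.mu.Unlock()
}

// NeedsCheck reports whether a chunk was touched and must be re-verified.
func (t *chunkTracker) NeedsCheck(chunk int64) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.touched[chunk]
}

// MarkVerified clears the mark once a chunk passes verification.
func (t *chunkTracker) MarkVerified(chunk int64) {
	t.mu.Lock()
	delete(t.touched, chunk)
	t.mu.Unlock()
}
```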

@tirsen

tirsen commented Oct 9, 2020

Yeah we only run these diffs against standby replicas so FTWRL is not a problem.

Doesn't chunk checksumming require a special updated_at column maintained by the application? That makes it less generally useful...

@tirsen

tirsen commented Oct 9, 2020

Oh I see you just check it a few times until it succeeds? Yeah that might work! That's really nice!

@IANTHEREAL
Collaborator

@tirsen sorry for the late reply! Yep, the optimistic data checksum schema would try to check it a few times until it succeeds.

I have a project implementation question: do you want to implement it as a more general library, e.g. making upstream data fetching, chunk splitting, data checksum & comparison, and waiting for verification timing into interfaces?
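
For what it's worth, a rough sketch of what those interfaces might look like (purely illustrative naming, not an agreed API):

```go
// Purely illustrative interface sketch for a general validation library.
package validate

import (
	"context"
	"time"
)

// Chunk identifies a verifiable range of rows in one table.
type Chunk struct {
	Table string
	Where string        // e.g. "id >= ? AND id < ?"
	Args  []interface{} // bound arguments for Where
}

// Splitter divides a table into chunks.
type Splitter interface {
	Split(ctx context.Context, table string) ([]Chunk, error)
}

// Fetcher computes a checksum (or fetches rows) for one chunk on one side.
type Fetcher interface {
	Checksum(ctx context.Context, c Chunk) (sum uint64, count int64, err error)
}

// Comparer decides whether the upstream and downstream results match.
type Comparer interface {
	Equal(upSum, downSum uint64, upCount, downCount int64) bool
}

// Waiter controls whether and when to retry a chunk that failed verification
// (e.g. the optimistic scheme waits for replication to catch up).
type Waiter interface {
	NextRetry(attempt int) (wait time.Duration, giveUp bool)
}
```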

@IANTHEREAL
Collaborator

There are two ideas proposed above. If other people have no other new ideas, I think we can choose a plan. What do you think? @lance6716 @tirsen @csuzhangxc

@csuzhangxc
Member Author

I think we need both of them for different scenarios, but we may choose to implement optimistic data checksum first.

@tirsen

tirsen commented Jul 27, 2021

optimistic data checksum sounds like it would certainly work very well for us.
