Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

bad index stats after table cut-over #875

Open
cenkore opened this issue Aug 17, 2020 · 3 comments
Open

bad index stats after table cut-over #875

cenkore opened this issue Aug 17, 2020 · 3 comments

Comments

@cenkore
Copy link

cenkore commented Aug 17, 2020

we are facing an issue lately that query plan maybe go wrong in a short time after alter a small table (180,000 rows).

After in-depth analysis, we found a _gho table cut-over before the statistics persisted, the new table initial opening would read persistent data in mysql.innodb_index_stats / mysql.innodb_table_stats, but there is no statistics at this time , and execution plan will go wrong in a short time until persistent recalculation requested.

 /* Persistent recalculation requested, called from
      1) ANALYZE TABLE, or
      2) the auto recalculation background thread, or
      3) open table if stats do not exist on disk and auto recalc
         is enabled */

repro:

after row-copy complete,query 2 sqls, show index from _xx_gho and mysql.innodb_index_stats, and after cut-over , query again.

first stage < after row-copy complete >:

2020-08-17 10:31:48 INFO Row copy complete
2020-08-17 10:31:48 INFO show index from `sthdb`.`_prd_sth_unit_gho`
                      Key_name|         Column_name|Cardinality|
                       PRIMARY|                  ID|        11|
       idx_DataChange_LastTime| DataChange_LastTime|        11|
2020-08-17 10:31:48 INFO query innodb_index_stats
                    table_name|                    index_name|         last_update|      stat_name|stat_value|sample_size|
            _prd_sth_unit_gho|                       PRIMARY| 2020-08-17 10:31:40|   n_diff_pfx01|         0|         1|
            _prd_sth_unit_gho|                       PRIMARY| 2020-08-17 10:31:40|   n_leaf_pages|         1|         1|
            _prd_sth_unit_gho|                       PRIMARY| 2020-08-17 10:31:40|           size|         1|         1|
            _prd_sth_unit_gho|       idx_DataChange_LastTime| 2020-08-17 10:31:40|   n_diff_pfx01|         0|         1|
            _prd_sth_unit_gho|       idx_DataChange_LastTime| 2020-08-17 10:31:40|   n_diff_pfx02|         0|         1|
            _prd_sth_unit_gho|       idx_DataChange_LastTime| 2020-08-17 10:31:40|   n_leaf_pages|         1|         1|
            _prd_sth_unit_gho|       idx_DataChange_LastTime| 2020-08-17 10:31:40|           size|         1|         1|

second stage < after row-copy complete >:

2020-08-17 10:31:50 INFO query innodb_index_stats
                    table_name|                    index_name|         last_update|      stat_name|stat_value|sample_size|
                 prd_sth_unit|                       PRIMARY| 2020-08-17 10:31:40|   n_diff_pfx01|         0|         1|
                 prd_sth_unit|                       PRIMARY| 2020-08-17 10:31:40|   n_leaf_pages|         1|         1|
                 prd_sth_unit|                       PRIMARY| 2020-08-17 10:31:40|           size|         1|         1|
                 prd_sth_unit|       idx_DataChange_LastTime| 2020-08-17 10:31:40|   n_diff_pfx01|         0|         1|
                 prd_sth_unit|       idx_DataChange_LastTime| 2020-08-17 10:31:40|   n_diff_pfx02|         0|         1|
                 prd_sth_unit|       idx_DataChange_LastTime| 2020-08-17 10:31:40|   n_leaf_pages|         1|         1|
                 prd_sth_unit|       idx_DataChange_LastTime| 2020-08-17 10:31:40|           size|         1|         1|
                 2020-08-17 10:31:50 INFO show index from `sthdb`.`prd_sth_unit`
                      Key_name|         Column_name|Cardinality|
                       PRIMARY|                  ID|         0|
       idx_DataChange_LastTime| DataChange_LastTime|         0|

solution:
Add analyze table operation (on the _gho table) after row copy complete, the statistics would be persisted then cut-over table.

Thanks.

@timvaillancourt
Copy link
Collaborator

timvaillancourt commented Aug 17, 2020

Add analyze table operation (on the _gho table) after row copy complete, the statistics would be persisted then cut-over table.

@cenkore 👋 thanks for reporting this, I think this solution makes sense 👍

Initially I was concerned about the overhead of running ANALYZE TABLE on a large table, however I'm seeing it run in 0.16 sec on MySQL 5.7.29 with a 1.4TB table (very fast!):

mysql> ANALYZE TABLE <REDACTED>;
+--------------------------------+---------+----------+----------+
| Table                          | Op      | Msg_type | Msg_text |
+--------------------------------+---------+----------+----------+
| <REDACTED>                     | analyze | status   | OK       |
+--------------------------------+---------+----------+----------+
1 row in set (0.16 sec)

I've also confirmed that the index stats are persisted through a RENAME TABLE. I should have some free cycles to implement this in the near future (no ETA) unless others would like to

cc @shlomi-noach

@timvaillancourt
Copy link
Collaborator

timvaillancourt commented Aug 17, 2020

As discussed with @tomkrouper (offline): if the server running the cut-over is read_only=1 (a replica), SET SQL_LOG_BIN=0 should be ran before ANALYZE TABLE to avoid an errant GTID in replication

In this case only the replica that ran ANALYZE TABLE would refresh innodb stats, no replicas below it. I think that's an acceptable limitation as running gh-ost cut-overs on replicas is typically for testing and no-ops

This logic would work if the server running the cut-over is read_only=0 (a primary), which is probably the most common case. In this case the replicas and primary would refresh innodb stats

@timvaillancourt timvaillancourt self-assigned this Aug 19, 2020
@druud
Copy link
Contributor

druud commented Oct 17, 2020

As discussed with @tomkrouper (offline): if the server running the cut-over is read_only=1 (a replica), SET SQL_LOG_BIN=0 should be ran before ANALYZE TABLE to avoid an errant GTID in replication

https://dev.mysql.com/doc/refman/5.7/en/analyze-table.html
also offers LOCAL (or NO_WRITE_TO_BINLOG), which probably achieves the same.

# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

No branches or pull requests

3 participants