Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

MDEV-36122 Assertion failure ctx0->old_table->get_ref_count() == 1 in ha_innobase::commit_inplace_alter_table() #3889

Open
wants to merge 4 commits into
base: 10.11
Choose a base branch
from

Conversation

dr-m
Copy link
Contributor

@dr-m dr-m commented Mar 12, 2025

  • The Jira issue number for this PR is: MDEV-36122

Description

trx_purge_close_tables(): Before releasing any metadata locks (MDL), release all table references, in case an ALTER TABLE…ALGORITHM=COPY operation has confused our logic.

trx_purge_table_acquire(), trx_purge_table_open(): Do not acquire any table reference before successfully acquiring a necessary metadata lock. In this way, if purge is waiting for MDL, a concurrent ha_innobase::commit_inplace_alter_table(commit=true) that is holding a conflicting MDL_EXCLUSIVE will only observe its own reference on the table that it may need to replace.

dict_table_open_on_id(): Simplify the logic.

dict_stats: A helper for acquiring MDL and opening the tables mysql.innodb_table_stats and mysql.innodb_index_stats.

innodb_ft_aux_table_validate(): Contiguously hold dict_sys.latch while accessing the table that we open with dict_table_open_on_name().

lock_table_children(): Do not hold a table reference while invoking dict_acquire_mdl_shared<false>(), which may temporarily release and reacquire the shared dict_sys.latch that we are holding.

prepare_inplace_alter_table_dict(): If an unexpected reference to the table exists, wait for the purge subsystem to release its table handle, similar to how we would do in case FULLTEXT INDEX existed. This function is supposed to be protected by MDL_EXCLUSIVE on the table name. If purge is going to access the table again later during is ALTER TABLE operation, it will have access to an MDL compatible name for it and therefore should conflict with any MDL_EXCLUSIVE that would cover ha_innobase::commit_inplace_alter_table(commit=true).

ha_innobase::rename_table(): Before locking the data dictionary, ensure that the purge subsystem is not holding a reference to the table due to the lack of metadata locking, related to FULLTEXT INDEX or the row-level undo logging of ALTER IGNORE TABLE.

With these changes, no caller of dict_acquire_mdl_shared<false> should be holding a table reference.

All remaining calls to dict_table_open_on_name(dict_locked=false) except the one in fts_lock_table() and possibly in the DDL recovery predicate innodb_check_version() should be protected by MDL, but there currently is no assertion that would enforce this.

Release Notes

Some race conditions between InnoDB internal table access and DDL were fixed.

How can this PR be tested?

Stress test by the RQG grammar that reproduced the assertion failures.

Basing the PR against the correct MariaDB version

  • This is a new feature or a refactoring, and the PR is based against the main branch.
  • This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

This is a 10.11 version of #3856.

PR quality check

  • I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

dr-m added 2 commits March 12, 2025 12:11
trx_purge_close_tables(): Before releasing any metadata locks (MDL),
release all table references, in case an ALTER TABLE…ALGORITHM=COPY
operation has confused our logic.

trx_purge_table_acquire(), trx_purge_table_open(): Do not acquire any
table reference before successfully acquiring a necessary metadata lock.
In this way, if purge is waiting for MDL, a concurrent
ha_innobase::commit_inplace_alter_table(commit=true) that is holding
a conflicting MDL_EXCLUSIVE will only observe its own reference on
the table that it may need to replace.

dict_acquire_mdl_shared<false>(): Unless we hold an MDL or one is
not needed, do not hold a table reference when releasing dict_sys.latch.
After loading a table into the dictionary cache, we will look up the
table again, because the table could be evicted or dropped while we
were not holding any dict_sys.latch.
dict_table_open_on_id(): Simplify the logic.

dict_stats: A helper for acquiring MDL and opening the tables
mysql.innodb_table_stats and mysql.innodb_index_stats.

innodb_ft_aux_table_validate(): Contiguously hold dict_sys.latch
while accessing the table that we open with dict_table_open_on_name().

lock_table_children(): Do not hold a table reference while invoking
dict_acquire_mdl_shared<false>(), which may temporarily release and
reacquire the shared dict_sys.latch that we are holding.

prepare_inplace_alter_table_dict(): If an unexpected reference to the
table exists, wait for the purge subsystem to release its table handle,
similar to how we would do in case FULLTEXT INDEX existed. This function
is supposed to be protected by MDL_EXCLUSIVE on the table name.
If purge is going to access the table again later during is ALTER TABLE
operation, it will have access to an MDL compatible name for it and
therefore should conflict with any MDL_EXCLUSIVE that would cover
ha_innobase::commit_inplace_alter_table(commit=true).

ha_innobase::rename_table(): Before locking the data dictionary,
ensure that the purge subsystem is not holding a reference to the
table due to the lack of metadata locking, related to FULLTEXT INDEX
or the row-level undo logging of ALTER IGNORE TABLE.

With these changes, no caller of dict_acquire_mdl_shared<false>
should be holding a table reference.

All remaining calls to dict_table_open_on_name(dict_locked=false)
except the one in fts_lock_table() and possibly in the DDL recovery
predicate innodb_check_version() should be protected by MDL, but there
currently is no assertion that would enforce this.
@dr-m dr-m self-assigned this Mar 12, 2025
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

trx_purge_attach_undo_recs(), purge_sys_t::batch_cleanup():
Clear purge_sys.m_active only after all table handles have been released.
@dr-m dr-m force-pushed the 10.11-MDEV-36122 branch from db8b8d2 to bffae9c Compare March 13, 2025 08:56
ha_innobase::truncate(): Before locking the data dictionary,
ensure that the purge subsystem is not holding a reference to the
table due to insufficient metadata locking related to an earlier
ALTER IGNORE TABLE operation.
# for free to join this conversation on GitHub. Already have an account? # to comment
Development

Successfully merging this pull request may close these issues.

3 participants