MDEV-36316/MDEV-36327/MDEV-36328 Debug msan fixes 10.6 #3899

Open
wants to merge 4 commits into
base: 10.6

Conversation

grooverdan
Member

  • The Jira issue number for this PR is: MDEV-36316/MDEV-36327/MDEV-36328

Description

Various fixes to allow a Debug MSAN build with Clang-20 to pass tests.

See individual commit messages for details.

Release Notes

  • Internal changes only.

How can this PR be tested?

podman run --rm -ti -v "$PWD":/source:z --mount=type=tmpfs,tmpfs-size=10G,dst=/build --shm-size=10g --workdir /build --entrypoint /bin/bash --user buildbot --cap-add=SYS_PTRACE --privileged quay.io/mariadb-foundation/bb-worker:dev_debian12-msan-clang-20

cmake    -DWITH_EMBEDDED_SERVER=OFF \
                -DWITH_INNODB_{BZIP2,LZ4,LZMA,LZO,SNAPPY}=OFF \
                -DPLUGIN_{MROONGA,ROCKSDB,OQGRAPH,SPIDER}=NO \
                -DWITH_ZLIB=bundled \
                -DHAVE_LIBAIO_H=0 \
                -DCMAKE_DISABLE_FIND_PACKAGE_{URING,LIBAIO}=1 \
                -DWITH_NUMA=NO \
                -DWITH_SYSTEMD=no \
                -DWITH_MSAN=ON \
                -DHAVE_CXX_NEW=1 \
                -DCMAKE_{EXE,MODULE}_LINKER_FLAGS="-L${MSAN_LIBDIR} -Wl,-rpath=${MSAN_LIBDIR}" \
                -DCMAKE_CXX_FLAGS=-fsanitize=memory \
                -DWITH_DBUG_TRACE=OFF \
                -DCMAKE_BUILD_TYPE=Debug \
                /source
cmake --build .
mysql-test/mtr --parallel=auto --force --big-test

Basing the PR against the correct MariaDB version

  • This is a new feature or a refactoring, and the PR is based against the main branch.
  • This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

With CMAKE_BUILD_TYPE=Debug, MSAN in clang-20.1 reports
MemorySanitizer: use-of-uninitialized-value in mach_read_from_2
called by rec_set_bit_field_2 (and likewise for the _1 equivalent).

Non-debug builds are assumed to optimize the read away, so that
the operation becomes a plain store of the new values.
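
A minimal sketch of the read-modify-write pattern this commit message refers to. The name set_bit_field_1 and its signature are hypothetical stand-ins, not the InnoDB functions, and the extra "& mask" on the new value is a defensive addition that is not part of the quoted code. The point is only that the old byte is loaded before the selected bits are replaced, so an uninitialized byte is read even though part of it is about to be overwritten.

```cpp
#include <cstdint>

// Hypothetical stand-in for a helper like rec_set_bit_field_1():
// replace the bits selected by `mask` in *byte with `val << shift`,
// keeping all other bits as they were.
inline void set_bit_field_1(std::uint8_t *byte, unsigned val,
                            unsigned mask, unsigned shift)
{
  // Read-modify-write: the old byte is loaded here, even though the
  // masked bits are about to be replaced.
  const unsigned kept_bits = *byte & ~mask;
  // "& mask" is an assumption of this sketch: it stops an oversized
  // `val` from spilling into neighbouring bits.
  *byte = static_cast<std::uint8_t>(kept_bits | ((val << shift) & mask));
}
```

If the byte was never written before the first call, the bits outside `mask` stay uninitialized after the call as well, which is the concern raised in the review comment on this change.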
@grooverdan added the "MariaDB Foundation" label (pull requests created by MariaDB Foundation) on Mar 20, 2025
@grooverdan requested a review from dr-m on March 20, 2025 06:11
@grooverdan changed the title from "MDEV-36316/MDEV-36317/MDEV-36318 Debug msan fixes 10.6" to "MDEV-36316/MDEV-36327/MDEV-36328 Debug msan fixes 10.6" on Mar 20, 2025
Without this increase, the mtr test case pre/post-condition checks
fail because stack usage has increased under MSAN with clang-20.1.

432K was partially successful, but 448K was needed
for test cases that change the default collation.

With a smaller stack size, the observed behaviour was a SEGV when
a function allocated memory on the stack and then called another
function (possibly memset by coincidence, since it is commonly
called early in a function, right after allocation).
@grooverdan force-pushed the debug-msan-fixes-10.6 branch from 12611e9 to b445f66 on March 20, 2025 07:52
The function dict_process_sys_columns_rec left nth_v_col uninitialized
unless the column was virtual. This was OK because the function
i_s_sys_columns_fill_table also didn't read this value unless the
column was virtual.

As MSAN in clang-20 didn't follow this through, the pass by value
was changed to a pass by pointer so that MSAN could detect this correctly.
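
The pattern described in this commit message can be sketched as follows. All names here (column_rec, process_column, fill_row) are hypothetical and only mirror the shape of the two functions mentioned. The producer writes the output only for virtual columns, and the consumer takes a pointer and dereferences it only in that same case, so the possibly uninitialized value is not copied into a by-value argument at the call site.

```cpp
#include <cstddef>

// Hypothetical record type, for illustration only.
struct column_rec
{
  bool is_virtual;
  std::size_t v_pos;
};

// Producer: writes *nth_v_col only when the column is virtual,
// otherwise leaves it untouched.
inline void process_column(const column_rec &rec, std::size_t *nth_v_col)
{
  if (rec.is_virtual)
    *nth_v_col = rec.v_pos;
}

// Consumer: takes a pointer and dereferences it only for virtual
// columns, so an uninitialized value is never loaded otherwise.
inline std::size_t fill_row(const column_rec &rec,
                            const std::size_t *nth_v_col)
{
  return rec.is_virtual ? *nth_v_col : 0;
}
```

An alternative with the same effect, assuming no caller depends on the uninitialized state, would be to initialize the variable at its declaration and keep passing it by value.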
…n_range

ror_scan_selectivity passed an uninitialized page structure, so
we shouldn't be using its values. btr_estimate_n_rows_in_range
doesn't use the page numbers in the tuples, so these can be omitted.

ror_scan_selectivity never uses the result, but the MRR call
to records_in_range does.
@grooverdan force-pushed the debug-msan-fixes-10.6 branch from b445f66 to 5e9b106 on March 21, 2025 05:42
Comment on lines +159 to 164
+ #ifndef DBUG_OFF
+ MEM_MAKE_DEFINED(rec - offs, 1);
+ #endif
  mach_write_to_1(rec - offs,
                  (mach_read_from_1(rec - offs) & ~mask)
                  | (val << shift));
Contributor

This looks incorrect to me. Why would we claim that all bits at rec[-offs] are initialized when we are only overwriting some of the bits here? What would fail if this change and the similar change to rec_set_bit_field_2() were omitted?

Comment on lines -5456 to 5457
- ulint nth_v_col, /*!< in: virtual column, its
+ ulint* nth_v_col, /*!< in: virtual column, its
  sequence number (nth virtual col) */
Contributor

I do not understand why we would need any of the changes to this file and which problem these changes would solve. We’re no longer passing a read-only parameter by value but via a pointer that is effectively read-only. Can you test again without including any of these changes?

Comment on lines -14479 to +14480
- btr_pos_t tuple1(range_start, mode1, pages->first_page);
- btr_pos_t tuple2(range_end, mode2, pages->last_page);
+ btr_pos_t tuple1(range_start, mode1, 0);
+ btr_pos_t tuple2(range_end, mode2, 0);
Contributor

This would seem to be the actual fix. ~0ULL might be a safer value, but I think that 0 should be OK as well, because the smallest possible index page number is 3.

Labels
MariaDB Foundation Pull requests created by MariaDB Foundation

2 participants