Skip to content

miscompiled TLS wrapper, likely coroutine related, with sanitizer #91312

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
avikivity opened this issue May 7, 2024 · 28 comments · Fixed by #91483
Closed

miscompiled TLS wrapper, likely coroutine related, with sanitizer #91312

avikivity opened this issue May 7, 2024 · 28 comments · Fixed by #91483
Assignees
Labels
ipo Interprocedural optimizations miscompilation release:backport

Comments

@avikivity
Copy link
Contributor

avikivity commented May 7, 2024

Prior to 533b7c1, compiling some coroutine that references a TLS variable generates this wrapper:

0000000000029d80 <_ZTWN2db13schema_tablesL14the_merge_lockE>:
   29d80: 50                           	pushq	%rax
   29d81: e8 4a b3 10 00               	callq	0x1350d0 <__tls_init>
   29d86: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
   29d8f: 48 8d 80 00 00 00 00         	leaq	(%rax), %rax
		0000000000029d92:  R_X86_64_TPOFF32	_ZN2db13schema_tablesL14the_merge_lockE
   29d96: 59                           	popq	%rcx
   29d97: c3                           	retq
   29d98: 0f 1f 84 00 00 00 00 00      	nopl	(%rax,%rax)

It correctly calls __tls_init.

With 533b7c1 and later, up to 18.1.1, the following TLS wrapper is generated:

0000000000029d80 <_ZTWN2db13schema_tablesL14the_merge_lockE>:
   29d80: 50                           	pushq	%rax
   29d81: e8 00 00 00 00               	callq	0x29d86 <_ZTWN2db13schema_tablesL14the_merge_lockE+0x6>
		0000000000029d82:  R_X86_64_PLT32	_ZTH15data_type_for_vIN7seastar13basic_sstringIcjLj15ELb1EEEE-0x4
   29d86: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
   29d8f: 48 8d 80 00 00 00 00         	leaq	(%rax), %rax
		0000000000029d92:  R_X86_64_TPOFF32	_ZN2db13schema_tablesL14the_merge_lockE
   29d96: 59                           	popq	%rcx
   29d97: c3                           	retq
   29d98: 0f 1f 84 00 00 00 00 00      	nopl	(%rax,%rax)

The call to __tls_init was replaced by a call to some random function. When the coroutine is then called, it does not initialize the object.

I will follow up with a full reproducer.

@avikivity
Copy link
Contributor Author

Copying @dianqk, the author of the patch.

@avikivity avikivity changed the title miscompile TLS wrapper, likely coroutine related, with sanitizer miscompiled TLS wrapper, likely coroutine related, with sanitizer May 7, 2024
@tchaikov
Copy link
Contributor

tchaikov commented May 7, 2024

another data point is that if the source is compiled with -O0, the issue goes away. i guess the reason is that -O0 disables the globalopt pass.

@avikivity
Copy link
Contributor Author

Reproducer:

wget https://scratch.scylladb.com/schema_tables.ii


clang++ -MD -MT build/debug/db/schema_tables.o -MF build/debug/db/schema_tables.o.d -std=c++20 -I/home/avi/scylla/seastar/include -I/home/avi/scylla/build/debug/seastar/gen/include -U_FORTIFY_SOURCE -Werror=unused-result -fstack-clash-protection -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -DSEASTAR_API_LEVEL=7 -DSEASTAR_BUILD_SHARED_LIBS -DSEASTAR_SSTRING -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_DEBUG -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEBUG_PROMISE -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_TYPE_ERASE_MORE -DFMT_SHARED -I/usr/include/p11-kit-1 -ffile-prefix-map=/home/avi/scylla=. -march=westmere -DDEBUG -DSANITIZE -DDEBUG_LSA_SANITIZER -DSCYLLA_ENABLE_ERROR_INJECTION -Og -DSCYLLA_BUILD_MODE=debug -g -gz -iquote. -iquote build/debug/gen -std=gnu++20  -ffile-prefix-map=/home/avi/scylla=. -march=westmere -DBOOST_ALL_DYN_LINK    -fvisibility=hidden -isystem abseil -Wall -Werror -Wextra -Wimplicit-fallthrough -Wno-mismatched-tags -Wno-c++11-narrowing -Wno-overloaded-virtual -Wno-unused-parameter -Wno-unsupported-friend -Wno-missing-field-initializers -Wno-deprecated-copy -Wno-psabi -Wno-enum-constexpr-conversion -Wno-error=deprecated-declarations -DXXH_PRIVATE_API -DSEASTAR_TESTING_MAIN  -S  schema_tables.ii -o schema_tables.s

grep -A10 '_ZTWN2db13schema_tablesL14the_merge_lockE.*:' schema_tables.s

@dianqk
Copy link
Member

dianqk commented May 7, 2024

Thanks. I can reproduce it. The bisected outcome might not be the issue itself, but I'll investigate this first.

@avikivity
Copy link
Contributor Author

Simplified command line:

clang++    -fsanitize=address       -Og    -std=gnu++20       -Wno-c++11-narrowing -Wno-unsupported-friend  -S  schema_tables.ii -o schema_tables.s && grep -A10 '_ZTWN2db13schema_tablesL14the_merge_lockE.*:' schema_tables.s

I verified that -fsanitize=address is required.

@avikivity
Copy link
Contributor Author

Yes it's probably coroutines not emitting llvm.used. I remember bugs in this area.

@EugeneZelenko EugeneZelenko added coroutines C++20 coroutines and removed new issue labels May 7, 2024
@llvmbot
Copy link
Member

llvmbot commented May 7, 2024

@llvm/issue-subscribers-coroutines

Author: Avi Kivity (avikivity)

Prior to 533b7c1, compiling some coroutine that references a TLS variable generates this wrapper:
0000000000029d80 &lt;_ZTWN2db13schema_tablesL14the_merge_lockE&gt;:
   29d80: 50                           	pushq	%rax
   29d81: e8 4a b3 10 00               	callq	0x1350d0 &lt;__tls_init&gt;
   29d86: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
   29d8f: 48 8d 80 00 00 00 00         	leaq	(%rax), %rax
		0000000000029d92:  R_X86_64_TPOFF32	_ZN2db13schema_tablesL14the_merge_lockE
   29d96: 59                           	popq	%rcx
   29d97: c3                           	retq
   29d98: 0f 1f 84 00 00 00 00 00      	nopl	(%rax,%rax)

It correctly calls __tls_init.

With 533b7c1 and later, up to 18.1.1, the following TLS wrapper is generated:

0000000000029d80 &lt;_ZTWN2db13schema_tablesL14the_merge_lockE&gt;:
   29d80: 50                           	pushq	%rax
   29d81: e8 00 00 00 00               	callq	0x29d86 &lt;_ZTWN2db13schema_tablesL14the_merge_lockE+0x6&gt;
		0000000000029d82:  R_X86_64_PLT32	_ZTH15data_type_for_vIN7seastar13basic_sstringIcjLj15ELb1EEEE-0x4
   29d86: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
   29d8f: 48 8d 80 00 00 00 00         	leaq	(%rax), %rax
		0000000000029d92:  R_X86_64_TPOFF32	_ZN2db13schema_tablesL14the_merge_lockE
   29d96: 59                           	popq	%rcx
   29d97: c3                           	retq
   29d98: 0f 1f 84 00 00 00 00 00      	nopl	(%rax,%rax)

The call to __tls_init was replaced by a call to some random function. When the coroutine is then called, it does not initialize the object.

I will follow up with a full reproducer.

@dianqk
Copy link
Member

dianqk commented May 7, 2024

Simplified command line:

clang++    -fsanitize=address       -Og    -std=gnu++20       -Wno-c++11-narrowing -Wno-unsupported-friend  -S  schema_tables.ii -o schema_tables.s && grep -A10 '_ZTWN2db13schema_tablesL14the_merge_lockE.*:' schema_tables.s

I verified that -fsanitize=address is required.

It should also need -fvisibility=hidden.

@dianqk
Copy link
Member

dianqk commented May 7, 2024

I haven't found any obvious issues, unless I missed something.

_ZTH15data_type_for_vIN7seastar13basic_sstringIcjLj15ELb1EEEE is an alias for __tls_init:

@_ZTHN2db13schema_tablesL14the_merge_lockE = internal alias void (), ptr @__tls_init
@_ZTH15data_type_for_vIN7seastar13basic_sstringIcjLj15ELb1EEEE = linkonce_odr alias void (), ptr @__tls_init
@_ZTH15data_type_for_vIiE = linkonce_odr alias void (), ptr @__tls_init

Perhaps we need a runtime issue reproduction?

@avikivity
Copy link
Contributor Author

Aha. The runtime reproducer is quite heavy. If you want, I can provide a pre-built docker image and instructions to expose the problem with gdb.

@avikivity
Copy link
Contributor Author

My guess is that it's related. The compiler lost some information about TLS, so it started aliasing __tls_init to some other function. I'm single-stepping through it now, and I'm seeing that it's not initializing the variable.

@avikivity
Copy link
Contributor Author

I think what happens is that the guard variable is set when __tls_init (or its alias) is called, so the initializers aren't run.

I set a watchpoint on the variable:

(gdb) bt
#0  0x0000000003fa2dc7 in __tls_init ()
#1  0x0000000003fc2ff6 in TLS wrapper function for data_type_for_v<utils::UUID> ()
#2  0x0000000003fc16c4 in data_type_for<utils::UUID> () at ./types/types.hh:728
#3  cql3::untyped_result_set_row::get_as<utils::UUID> (this=<optimized out>, name=...) at ./cql3/untyped_result_set.hh:71
#4  0x00000000062f5a13 in db::system_keyspace::load_local_info (this=<optimized out>) at ./db/system_keyspace.cc:1431
#5  0x000000000634e487 in std::__n4861::coroutine_handle<seastar::internal::coroutine_traits_base<db::system_keyspace::local_info>::promise_type>::resume (this=<optimized out>)
    at /usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/coroutine:242
#6  seastar::internal::coroutine_traits_base<db::system_keyspace::local_info>::promise_type::run_and_dispose (this=<optimized out>) at ./seastar/include/seastar/core/coroutine.hh:83
#7  0x00007ffff616cd9f in seastar::reactor::run_tasks (this=0x52400001c100, tq=...) at ./build/debug/seastar/./seastar/src/core/reactor.cc:2690
#8  0x00007ffff6178c76 in seastar::reactor::run_some_tasks (this=0x52400001c100) at ./build/debug/seastar/./seastar/src/core/reactor.cc:3152
#9  0x00007ffff617e0ba in seastar::reactor::do_run (this=0x52400001c100) at ./build/debug/seastar/./seastar/src/core/reactor.cc:3320
#10 0x00007ffff617c14a in seastar::reactor::run (this=0x52400001c100) at ./build/debug/seastar/./seastar/src/core/reactor.cc:3210
#11 0x00007ffff5d0e3e1 in seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) (this=0x7fffecc110f0, ac=7, av=0x7fffffffd618, func=...) at ./build/debug/seastar/./seastar/src/core/app-template.cc:276
#12 0x00007ffff5d0bd6b in seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&) (this=0x7fffecc110f0, ac=7, av=0x7fffffffd618, func=...) at ./build/debug/seastar/./seastar/src/core/app-template.cc:167
#13 0x0000000002e8f2ba in scylla_main (ac=<optimized out>, av=<optimized out>) at ./main.cc:677
#14 0x0000000002e8d348 in std::function<int (int, char**)>::operator()(int, char**) const (this=0x7fffeca0a510, __args=<optimized out>, __args=<optimized out>)
    at /usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/std_function.h:591
#15 main (ac=7, av=<optimized out>) at ./main.cc:2162

So some unrelated __tls_init set it.

@avikivity
Copy link
Contributor Author

Confirmed. So a local __tls_init is generated, but then another gets called, perhaps due to the aliasing.

@avikivity
Copy link
Contributor Author

I verified that a binary built before the patch works, and after the patch fails. So the bisection is still accurate.

@avikivity
Copy link
Contributor Author

diff of the generated assembly:

$ diff -u schema_tables.s.full.{good,bad}
--- schema_tables.s.full.good	2024-05-07 20:02:55.038613957 +0300
+++ schema_tables.s.full.bad	2024-05-07 20:50:43.162254156 +0300
@@ -396,7 +396,6 @@
 0000000000000f40 l     O .rodata	0000000000000060 __PRETTY_FUNCTION__._ZN2db13schema_tables15hold_merge_lockEv
 0000000000000000 l    d  .text._ZN7seastar9get_unitsINS_35semaphore_default_exception_factoryENSt6chrono3_V212steady_clockEEENS_6futureINS_15semaphore_unitsIT_T0_EEEERNS_15basic_semaphoreIS7_S8_EEm	0000000000000000 .text._ZN7seastar9get_unitsINS_35semaphore_default_exception_factoryENSt6chrono3_V212steady_clockEEENS_6futureINS_15semaphore_unitsIT_T0_EEEERNS_15basic_semaphoreIS7_S8_EEm
 0000000000003c5a l     O .rodata.str1.1	0000000000000039 .L___asan_gen_.4441
-00000000001350d0 l     F .text	000000000000011c __tls_init
 0000000000000138 l       .tbss	0000000000000050 db::schema_tables::the_merge_lock
 0000000000003c93 l     O .rodata.str1.1	000000000000006b .L___asan_gen_.4442
 000000000018d900 l     F .text	0000000000000778 db::schema_tables::merge_schema(seastar::sharded<db::system_keyspace>&, seastar::sharded<service::storage_proxy>&, gms::feature_service&, std::vector<mutation, std::allocator<mutation>>, bool) (.resume)
@@ -7220,6 +7219,7 @@
 0000000000000000  w    F .text._ZN7seastar20noncopyable_functionIFNS_6futureINS_15semaphore_unitsINS_35semaphore_default_exception_factoryENSt6chrono3_V212steady_clockEEEEEvEED2Ev	00000000000000ce .hidden seastar::noncopyable_function<seastar::future<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>> ()>::~noncopyable_function()
 0000000000000000  w    F .text._ZN7seastar20noncopyable_functionIFNS_6futureINS_15semaphore_unitsINS_35semaphore_default_exception_factoryENSt6chrono3_V212steady_clockEEEEEvEEC2EOSA_	0000000000000112 .hidden seastar::noncopyable_function<seastar::future<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>> ()>::noncopyable_function(seastar::noncopyable_function<seastar::future<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>> ()>&&)
 0000000000000000  w    F .text._ZN7seastar6futureIvE14then_impl_nrvoINS_20noncopyable_functionIFNS0_INS_15semaphore_unitsINS_35semaphore_default_exception_factoryENSt6chrono3_V212steady_clockEEEEEvEEESA_EET0_OT_	00000000000001cd .hidden seastar::future<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>> seastar::future<void>::then_impl_nrvo<seastar::noncopyable_function<seastar::future<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>> ()>, seastar::future<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>>>(seastar::noncopyable_function<seastar::future<seastar::semaphore_units<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>> ()>&&)
+00000000001350d0  w    F .text	000000000000011c .hidden thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>
 0000000000029da0 g     F .text	00000000000019a7 .hidden db::schema_tables::merge_schema(seastar::sharded<db::system_keyspace>&, seastar::sharded<service::storage_proxy>&, gms::feature_service&, std::vector<mutation, std::allocator<mutation>>, bool)
 0000000000000000         *UND*	0000000000000000 freeze(std::vector<mutation, std::allocator<mutation>> const&)
 0000000000000000  w    F .text._ZN7seastar8internal13futurize_baseIvE21make_exception_futureINSt15__exception_ptr13exception_ptrEEENS_6futureIvEEOT_	00000000000000ca .hidden seastar::future<void> seastar::internal::futurize_base<void>::make_exception_future<std::__exception_ptr::exception_ptr>(std::__exception_ptr::exception_ptr&&)
@@ -57984,7 +57984,8 @@
 
 0000000000029d80 <thread-local wrapper routine for db::schema_tables::the_merge_lock>:
    29d80: 50                           	pushq	%rax
-   29d81: e8 4a b3 10 00               	callq	0x1350d0 <__tls_init>
+   29d81: e8 00 00 00 00               	callq	0x29d86 <thread-local wrapper routine for db::schema_tables::the_merge_lock+0x6>
+		0000000000029d82:  R_X86_64_PLT32	thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>-0x4
    29d86: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
    29d8f: 48 8d 80 00 00 00 00         	leaq	(%rax), %rax
 		0000000000029d92:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock
@@ -336033,11 +336034,11 @@
   1350c5: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
   1350cf: 90                           	nop
 
-00000000001350d0 <__tls_init>:
+00000000001350d0 <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>>:
   1350d0: 53                           	pushq	%rbx
   1350d1: 64 80 3c 25 00 00 00 00 00   	cmpb	$0x0, %fs:0x0
 		00000000001350d5:  R_X86_64_TPOFF32	__tls_guard
-  1350da: 74 02                        	je	0x1350de <__tls_init+0xe>
+  1350da: 74 02                        	je	0x1350de <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0xe>
   1350dc: 5b                           	popq	%rbx
   1350dd: c3                           	retq
   1350de: 64 c6 04 25 00 00 00 00 01   	movb	$0x1, %fs:0x0
@@ -336047,7 +336048,7 @@
 		00000000001350f3:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock
   1350f7: 48 c1 e8 03                  	shrq	$0x3, %rax
   1350fb: 80 b8 00 80 ff 7f 00         	cmpb	$0x0, 0x7fff8000(%rax)
-  135102: 0f 85 a5 00 00 00            	jne	0x1351ad <__tls_init+0xdd>
+  135102: 0f 85 a5 00 00 00            	jne	0x1351ad <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0xdd>
   135108: 64 48 c7 04 25 00 00 00 00 01 00 00 00       	movq	$0x1, %fs:0x0
 		000000000013510d:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock
   135115: 64 48 8b 1c 25 00 00 00 00   	movq	%fs:0x0, %rbx
@@ -336055,13 +336056,13 @@
 		0000000000135121:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock+0x8
   135125: ba 38 00 00 00               	movl	$0x38, %edx
   13512a: 31 f6                        	xorl	%esi, %esi
-  13512c: e8 00 00 00 00               	callq	0x135131 <__tls_init+0x61>
+  13512c: e8 00 00 00 00               	callq	0x135131 <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0x61>
 		000000000013512d:  R_X86_64_PLT32	__asan_memset-0x4
   135131: 48 8d 83 00 00 00 00         	leaq	(%rbx), %rax
 		0000000000135134:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock+0x40
   135138: 48 c1 e8 03                  	shrq	$0x3, %rax
   13513c: 80 b8 00 80 ff 7f 00         	cmpb	$0x0, 0x7fff8000(%rax)
-  135143: 75 7d                        	jne	0x1351c2 <__tls_init+0xf2>
+  135143: 75 7d                        	jne	0x1351c2 <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0xf2>
   135145: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
   13514e: 48 8d 88 00 00 00 00         	leaq	(%rax), %rcx
 		0000000000135151:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock
@@ -336071,37 +336072,37 @@
 		0000000000135161:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock+0x48
   135165: 48 c1 e8 03                  	shrq	$0x3, %rax
   135169: 80 b8 00 80 ff 7f 00         	cmpb	$0x0, 0x7fff8000(%rax)
-  135170: 75 65                        	jne	0x1351d7 <__tls_init+0x107>
+  135170: 75 65                        	jne	0x1351d7 <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0x107>
   135172: 64 48 c7 04 25 00 00 00 00 00 00 00 00       	movq	$0x0, %fs:0x0
 		0000000000135177:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock+0x48
   13517f: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
   135188: 48 8d b0 00 00 00 00         	leaq	(%rax), %rsi
 		000000000013518b:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock
-  13518f: 48 8d 3d 00 00 00 00         	leaq	(%rip), %rdi            # 0x135196 <__tls_init+0xc6>
+  13518f: 48 8d 3d 00 00 00 00         	leaq	(%rip), %rdi            # 0x135196 <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0xc6>
 		0000000000135192:  R_X86_64_PC32	seastar::basic_semaphore<seastar::semaphore_default_exception_factory, std::chrono::_V2::steady_clock>::~basic_semaphore()-0x4
-  135196: 48 8d 15 00 00 00 00         	leaq	(%rip), %rdx            # 0x13519d <__tls_init+0xcd>
+  135196: 48 8d 15 00 00 00 00         	leaq	(%rip), %rdx            # 0x13519d <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0xcd>
 		0000000000135199:  R_X86_64_PC32	__dso_handle-0x4
-  13519d: e8 00 00 00 00               	callq	0x1351a2 <__tls_init+0xd2>
+  13519d: e8 00 00 00 00               	callq	0x1351a2 <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0xd2>
 		000000000013519e:  R_X86_64_PLT32	__cxa_thread_atexit-0x4
-  1351a2: e8 00 00 00 00               	callq	0x1351a7 <__tls_init+0xd7>
+  1351a2: e8 00 00 00 00               	callq	0x1351a7 <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0xd7>
 		00000000001351a3:  R_X86_64_PLT32	.text.startup+0x28c
   1351a7: 5b                           	popq	%rbx
-  1351a8: e9 00 00 00 00               	jmp	0x1351ad <__tls_init+0xdd>
+  1351a8: e9 00 00 00 00               	jmp	0x1351ad <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0xdd>
 		00000000001351a9:  R_X86_64_PLT32	.text.startup+0x2fc
   1351ad: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
   1351b6: 48 8d b8 00 00 00 00         	leaq	(%rax), %rdi
 		00000000001351b9:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock
-  1351bd: e8 00 00 00 00               	callq	0x1351c2 <__tls_init+0xf2>
+  1351bd: e8 00 00 00 00               	callq	0x1351c2 <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0xf2>
 		00000000001351be:  R_X86_64_PLT32	__asan_report_store8-0x4
   1351c2: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
   1351cb: 48 8d b8 00 00 00 00         	leaq	(%rax), %rdi
 		00000000001351ce:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock+0x40
-  1351d2: e8 00 00 00 00               	callq	0x1351d7 <__tls_init+0x107>
+  1351d2: e8 00 00 00 00               	callq	0x1351d7 <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0x107>
 		00000000001351d3:  R_X86_64_PLT32	__asan_report_store8-0x4
   1351d7: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
   1351e0: 48 8d b8 00 00 00 00         	leaq	(%rax), %rdi
 		00000000001351e3:  R_X86_64_TPOFF32	db::schema_tables::the_merge_lock+0x48
-  1351e7: e8 00 00 00 00               	callq	0x1351ec <__tls_init+0x11c>
+  1351e7: e8 00 00 00 00               	callq	0x1351ec <thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0x11c>
 		00000000001351e8:  R_X86_64_PLT32	__asan_report_store8-0x4
   1351ec: 0f 1f 40 00                  	nopl	(%rax)
 
@@ -763956,7 +763957,7 @@
 0000000000000000 <thread-local wrapper routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>>:
        0: 50                           	pushq	%rax
        1: e8 00 00 00 00               	callq	0x6 <thread-local wrapper routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>+0x6>
-		0000000000000002:  R_X86_64_PLT32	.text+0x1350cc
+		0000000000000002:  R_X86_64_PLT32	thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>-0x4
        6: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
 		000000000000000b:  R_X86_64_TPOFF32	data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>
        f: 59                           	popq	%rcx
@@ -763967,7 +763968,7 @@
 0000000000000000 <thread-local wrapper routine for data_type_for_v<int>>:
        0: 50                           	pushq	%rax
        1: e8 00 00 00 00               	callq	0x6 <thread-local wrapper routine for data_type_for_v<int>+0x6>
-		0000000000000002:  R_X86_64_PLT32	.text+0x1350cc
+		0000000000000002:  R_X86_64_PLT32	thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>-0x4
        6: 64 48 8b 04 25 00 00 00 00   	movq	%fs:0x0, %rax
 		000000000000000b:  R_X86_64_TPOFF32	data_type_for_v<int>
        f: 59                           	popq	%rcx

@avikivity
Copy link
Contributor Author

My explanation: the patch changed a local symbol __tls_init to a global weak symbol thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>. Another translation unit also emitted the same symbol (but with different contents), and it got linked into the executable in preference to ours.

@avikivity
Copy link
Contributor Author

Is there a way to disable the pass as a workaround? I tried -mllvm -disablepass=globalopt, but it didn't work.

@dianqk
Copy link
Member

dianqk commented May 8, 2024

Is there a way to disable the pass as a workaround? I tried -mllvm -disablepass=globalopt, but it didn't work.

AFAIK, there is no way to disable arbitrary passes.

@dianqk
Copy link
Member

dianqk commented May 8, 2024

My explanation: the patch changed a local symbol __tls_init to a global weak symbol thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>. Another translation unit also emitted the same symbol (but with different contents), and it got linked into the executable in preference to ours.

Thank you for your analysis. I will continue to look into this issue tonight.

@avikivity
Copy link
Contributor Author

My explanation: the patch changed a local symbol __tls_init to a global weak symbol thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>. Another translation unit also emitted the same symbol (but with different contents), and it got linked into the executable in preference to ours.

Thank you for your analysis. I will continue to look into this issue tonight.

Thanks!

@dianqk
Copy link
Member

dianqk commented May 8, 2024

My explanation: the patch changed a local symbol __tls_init to a global weak symbol thread-local initialization routine for data_type_for_v<seastar::basic_sstring<char, unsigned int, 15u, true>>. Another translation unit also emitted the same symbol (but with different contents), and it got linked into the executable in preference to ours.

Agreed, I believe this is key to the issue. This is a minimal reproducer: https://llvm.godbolt.org/z/n5berxjfo.

@f2 = linkonce_odr hidden alias void (), ptr @f1

define void @foo() {
  call void @f1()
  ret void
}

define void @bar() {
  call void @f2()
  ret void
}

define internal void @f1() {
  ret void
}

IIUC, both LLVM 16 and LLVM 17 miscompiled. I expect that the call to f1 in foo can be overridden at link time, while the f2 in bar cannot.

I've created a draft PR #91483, but I still need some time to confirm this PR. Feel free giving it a try to see if everything is working as expected. :)

@avikivity
Copy link
Contributor Author

The original reproducer now generates the call to __tls_init. I'll test the executable later.

@avikivity
Copy link
Contributor Author

The executable also works.

@dianqk dianqk self-assigned this May 9, 2024
@dianqk dianqk added miscompilation and removed coroutines C++20 coroutines labels May 9, 2024
@dianqk dianqk added this to the LLVM 18.X Release milestone May 16, 2024
@github-project-automation github-project-automation bot moved this to Needs Triage in LLVM Release Status May 16, 2024
@tstellar tstellar moved this from Needs Triage to Needs Fix in LLVM Release Status May 16, 2024
dianqk added a commit that referenced this issue May 16, 2024
…91483)

Fixes #91312.

Don't perform the transform if the alias may be replaced at link time.
@github-project-automation github-project-automation bot moved this from Needs Fix to Done in LLVM Release Status May 16, 2024
@EugeneZelenko EugeneZelenko added ipo Interprocedural optimizations and removed llvm:optimizations labels May 16, 2024
@dianqk
Copy link
Member

dianqk commented May 16, 2024

/cherry-pick c796900

llvmbot pushed a commit to llvmbot/llvm-project that referenced this issue May 16, 2024
…lvm#91483)

Fixes llvm#91312.

Don't perform the transform if the alias may be replaced at link time.

(cherry picked from commit c796900)
@llvmbot
Copy link
Member

llvmbot commented May 16, 2024

/pull-request #92468

@MaskRay
Copy link
Member

MaskRay commented May 17, 2024

Simplified command line:

clang++    -fsanitize=address       -Og    -std=gnu++20       -Wno-c++11-narrowing -Wno-unsupported-friend  -S  schema_tables.ii -o schema_tables.s && grep -A10 '_ZTWN2db13schema_tablesL14the_merge_lockE.*:' schema_tables.s

I verified that -fsanitize=address is required.

#92468 should be safe, but I have a question about suppressing the optimization for linkonce_odr/weak_odr, so I want to compile this example myself. However, trunk clang currently crashes

 #9 0x000055bcabb61095 clang::NestedNameSpecifier::print(llvm::raw_ostream&, clang::PrintingPolicy const&, bool) const (/tmp/Rel/bin/clang+++0xab55095)
#10 0x000055bcabb60d40 clang::NestedNameSpecifier::print(llvm::raw_ostream&, clang::PrintingPolicy const&, bool) const (/tmp/Rel/bin/clang+++0xab54d40)
#11 0x000055bcab381dd8 (anonymous namespace)::FailedBooleanConditionPrinterHelper::handledStmt(clang::Stmt*, llvm::raw_ostream&) SemaTemplate.cpp:0:0
#12 0x000055bcabc0be88 (anonymous namespace)::StmtPrinter::PrintExpr(clang::Expr*) StmtPrinter.cpp:0:0
#13 0x000055bcabbf85ef (anonymous namespace)::StmtPrinter::VisitUnaryOperator(clang::UnaryOperator*) StmtPrinter.cpp:0:0
#14 0x000055bcabbf7172 clang::Stmt::printPretty(llvm::raw_ostream&, clang::PrinterHelper*, clang::PrintingPolicy const&, unsigned int, llvm::StringRef, clang::ASTContext const*) const (/tmp/Rel/bin/clang+++0xabeb1
72)
#15 0x000055bcab2e558f clang::Sema::findFailedBooleanCondition[abi:cxx11](clang::Expr*) (/tmp/Rel/bin/clang+++0xa2d958f)
#16 0x000055bcab2e5d60 clang::Sema::CheckTemplateIdType(clang::TemplateName, clang::SourceLocation, clang::TemplateArgumentListInfo&) (/tmp/Rel/bin/clang+++0xa2d9d60)
#17 0x000055bcab4817ab clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformTemplateSpecializationType(clang::TypeLocBuilder&, clang::TemplateSpecializationTypeLoc, clang::TemplateName) Sema
TemplateInstantiate.cpp:0:0
#18 0x000055bcab489ab0 clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformTemplateSpecializationType(clang::TypeLocBuilder&, clang::TemplateSpecializationTypeLoc) SemaTemplateInstantiate.c
pp:0:0
#19 0x000055bcab45a5ac clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformType(clang::TypeLocBuilder&, clang::TypeLoc) SemaTemplateInstantiate.cpp:0:0
#20 0x000055bcab486503 clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformElaboratedType(clang::TypeLocBuilder&, clang::ElaboratedTypeLoc) SemaTemplateInstantiate.cpp:0:0
#21 0x000055bcab45a726 clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformType(clang::TypeLocBuilder&, clang::TypeLoc) SemaTemplateInstantiate.cpp:0:0
#22 0x000055bcab45a136 clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformType(clang::TypeSourceInfo*) SemaTemplateInstantiate.cpp:0:0
#23 0x000055bcab459d89 clang::Sema::SubstType(clang::TypeSourceInfo*, clang::MultiLevelTemplateArgumentList const&, clang::SourceLocation, clang::DeclarationName, bool) (/tmp/Rel/bin/clang+++0xa44dd89)
#24 0x000055bcab2f1e5e SubstDefaultTemplateArgument(clang::Sema&, clang::TemplateDecl*, clang::SourceLocation, clang::SourceLocation, clang::TemplateTypeParmDecl*, llvm::ArrayRef<clang::TemplateArgument>, llvm::Ar
rayRef<clang::TemplateArgument>) SemaTemplate.cpp:0:0
#25 0x000055bcab2f182f clang::Sema::SubstDefaultTemplateArgumentIfAvailable(clang::TemplateDecl*, clang::SourceLocation, clang::SourceLocation, clang::Decl*, llvm::ArrayRef<clang::TemplateArgument>, llvm::ArrayRef
<clang::TemplateArgument>, bool&) (/tmp/Rel/bin/clang+++0xa2e582f)

@MaskRay MaskRay reopened this May 17, 2024
@github-project-automation github-project-automation bot moved this from Done to Needs Triage in LLVM Release Status May 17, 2024
tstellar pushed a commit to llvmbot/llvm-project that referenced this issue May 17, 2024
…lvm#91483)

Fixes llvm#91312.

Don't perform the transform if the alias may be replaced at link time.

(cherry picked from commit c796900)
@avikivity
Copy link
Contributor Author

Thanks a lot for the fix and backport.

@dianqk
Copy link
Member

dianqk commented May 20, 2024

I have created an issue for the assertion at #92757. Per #92619, we should be able to close this PR again.

@dianqk dianqk closed this as completed May 20, 2024
@github-project-automation github-project-automation bot moved this from Needs Triage to Done in LLVM Release Status May 20, 2024
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
ipo Interprocedural optimizations miscompilation release:backport
Projects
Development

Successfully merging a pull request may close this issue.

7 participants