valgrind BSD #67153

irevoire · 2019-12-08T13:07:56Z

Hello,
I’m trying to profile a simple program on FreeBSD 11.2 with rustc 1.39.0 and valgrind 3.10.1.

I always get the following error:

thread '<unnamed>' panicked at 'failed to allocate a guard page', src/libstd/sys/unix/thread.rs:336:17
stack backtrace:
   0:           0x116af8 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hf24f776311cdf529
   1:           0x12eed0 - core::fmt::write::h01edf6dd68a42c9c
   2:           0x115174 - std::io::stdio::_print::hf62f3360234422f1
   3:           0x11870d - rust_oom
   4:           0x1183e3 - rust_oom
   5:           0x118d57 - std::panicking::rust_panic_with_hook::h9a662f58cf3f8ffe
   6:           0x118b36 - std::panicking::begin_panic_fmt::hb07c3a07faf6b5b3
   7:           0x119436 - std::rt::lang_start_internal::h7d3aa0ed326f9560
   8:           0x110dc3 -
   9:           0x110c6b -
  10:           0x110a25 -
  11:          0x4023000 - <unknown>
fatal runtime error: failed to initiate panic, error 94376704
error: valgrind command failed


Mark-Simulacrum · 2019-12-08T13:29:40Z

This is at least plausibly a valgrind bug? Could you try with a more recent version (I have 3.15 locally, at least)?

In any case I suspect that since it works outside of valgrind (I assume, given you didn't mention otherwise) that this would be a bug inside valgrind.

irevoire · 2019-12-08T14:18:47Z

Yes, it does work outside of valgrind, and valgrind also works for other C programs.

I don’t know how to get a more recent version of valgrind: in both the precompiled packages and the FreeBSD ports tree, the latest version of valgrind is 3.10.1.

And when I try to compile valgrind from source, it fails because at some point it reports that my CPU (amd64) is not supported.

If you are also running under FreeBSD could you send me your valgrind binary?

Mark-Simulacrum · 2019-12-08T15:09:15Z

I am not running under FreeBSD. I suspect this bug is unlikely to make progress as-is; it sounds like it's probably a valgrind bug (missing emulation or lack of support for guard pages, something along those lines).

ranma42 · 2019-12-08T15:11:29Z

Would it be possible to disable guard pages to check if it makes a difference?
(with a compile-time option or an environment variable?)

Mark-Simulacrum · 2019-12-08T15:15:07Z

I think no (they're sort of inherently protecting you, i.e., a safety feature). But I'm not sure. You could plausibly comment out the stack guard support in libstd, I guess, and recompile -- https://github.com/rust-lang/rust/blob/master/src/libstd/sys/unix/stack_overflow.rs

nagisa · 2019-12-08T17:59:18Z

Guard pages are a prerequisite for ensuring soundness in case a stack overflow occurs. C works because it doesn’t do anything along these lines for you.

That being said, there seems to be more stuff going wrong here than just inability to map a page:

fatal runtime error: failed to initiate panic, error 94376704

Either way, this is more attributable to valgrind rather than Rust.

asomers · 2021-02-02T23:25:44Z

I opened a bug with Valgrind. We'll see what they say.
https://bugs.kde.org/show_bug.cgi?id=432440

paulfloyd · 2021-02-03T10:36:01Z

FreeBSD Valgrind maintainer here.

I'm not at all familiar with Rust. Is this code creating a stack for the main process or for a secondary thread?

With ktrace I'm seeing a lot of mmap'ing going on - much more than for 'grep'.

From what I understand, for C/C++ applications a MAP_GUARD region is mmap'ed above the user stack on startup, similar to what this code is doing but much larger and without needing to use mprotect. Valgrind is leaving space for this mapping, and I don't think that it likes client code mmap'ing into it.

paulfloyd · 2021-03-11T13:08:36Z

Can anyone comment on my previous question?

Is this happening on the main thread during program startup?
Why isn't rust using a MAP_GUARD region (designed for this purpose, or so it seems to me) rather than mmap'ing then calling mprotect?

paulfloyd · 2021-03-12T12:29:04Z

More questions:

What is Linux doing differently? Is it also mmap'ing a guard page? And if so, how does it determine the address to use?

My impression on FreeBSD is that the rust runtime is getting the user stack via the "kern.usrstack" sysctl. That's what I see with ktrace, but looking at the rust source I also see calls to pthread_attr_getstack.

The Valgrind code that allocates the user stack is here starting on line 1048. The FreeBSD code does the same, though obviously there are differences in the setup_client_stack function.

I've started doing some tests to see if I can intercept the sysctl for kern.usrstack to return a value that is correct for the guest.

paulfloyd · 2021-03-13T14:15:21Z

A few tests on Linux. For a C++ executable, when running standalone the user stack is around 0x7ffeffffff. When running under Valgrind, 0x7ffeffff00 is where VG's stack is and the guest's stack is around 0x1ffeffffff. With strace I see the following 4k mmaps (I'm assuming the stack guard is 4k).

That fits with what I see with the standalone C++ exe.

Under Valgrind

This time I don't see an mmap near where the user stack is.

asomers · 2021-03-27T02:36:09Z

@paulfloyd I'm not highly familiar with the internals of the Rust compiler, but I can easily answer one of your questions:

Rust does not use MAP_GUARD because that's a FreeBSD-specific feature, and it's younger than Rust's stack overflow detection code. It wasn't added until FreeBSD 11.1. We could change it, but IIRC Rust currently tries to support back to FreeBSD 10.0 or something. I disagree with that, but it's not up to me to change it.

paulfloyd · 2021-03-28T10:29:05Z

Fair enough.

However, I still don't have answers to many of my questions.

What is being done differently on Linux?

How do you determine the address for this guard page on FreeBSD?

Without this information it is going to be difficult for me to progress.

asomers · 2021-03-28T13:48:05Z

Here are the two files that deal with guard pages.
https://github.com/rust-lang/rust/blob/master/library/std/src/sys/unix/stack_overflow.rs
https://github.com/rust-lang/rust/blob/master/library/std/src/sys/unix/thread.rs

asomers · 2021-03-29T05:23:41Z

@paulfloyd I did some work on the guard page stuff this weekend. It's not my area of expertise, but I'm making some progress in fixing stack overflow detection. Fingers crossed, that might fix Valgrind too.

paulfloyd · 2021-03-29T06:08:15Z

> On 3/29/21 5:23 AM, Alan Somers wrote:
> @paulfloyd I did some work on the guard page stuff this weekend. It's not my area of expertise, but I'm making some progress in fixing stack overflow detection. Fingers crossed, that might fix Valgrind too.

Based on the rust source links that you sent I wrote the following C++ snippet:

    #include <iostream>
    #include <pthread.h>
    #include <pthread_np.h>

    int main()
    {
        int a;
        std::cout << " a addr " << &a << '\n';
        pthread_attr_t attr;
        pthread_attr_init(&attr);
        pthread_attr_get_np(pthread_self(), &attr);
        void* stackaddr;
        size_t stacksize;
        pthread_attr_getstack(&attr, &stackaddr, &stacksize);
        std::cout << "stack addr " << stackaddr << " stack size " << stacksize << '\n';
    }

Running it standalone produces

a addr 0x7fffffffe3cc
stack addr 0x7fffdffff000 stack size 536870912

This seems consistent. The address of the local on the stack in main is close to the address of the stack obtained via pthread_attr_getstack.

Running it under Valgrind produces

a addr 0x7fc00045c
stack addr 0x7fffdffff000 stack size 536870912

That's not good! The values returned by pthread_attr_getstack haven't changed, which means that they correspond to the Valgrind host stack rather than the guest stack.

The pthread_attr_get_np and pthread_attr_getstack functions are just reading the thread attributes. I need to do more debugging and reading of the thread and startup code to understand exactly how the stack address is generated and read. On the Valgrind side, reading these values needs to be intercepted in some way so that the values are correct for the guest.

I'll also have a go on Linux to see if it is handling this correctly.

On FreeBSD, pthread_attr_get_np() just copies out the thread's pthread_attr structure, which contains the stack address and size.

The FreeBSD libc code that creates the main thread is in /usr/src/lib/libthr/thread/thr_init.c, init_main_thread

```
	/*
	 * Set up the thread stack.
	 *
	 * Create a red zone below the main stack.  All other stacks
	 * are constrained to a maximum size by the parameters
	 * passed to mmap(), but this stack is only limited by
	 * resource limits, so this stack needs an explicitly mapped
	 * red zone to protect the thread stack that is just beyond.
	 */
	if (mmap(_usrstack - _thr_stack_initial -
	    _thr_guard_default, _thr_guard_default, 0, MAP_ANON,
	    -1, 0) == MAP_FAILED)
		PANIC("Cannot allocate red zone for initial thread");

	/*
	 * Mark the stack as an application supplied stack so that it
	 * isn't deallocated.
	 *
	 * XXX - I'm not sure it would hurt anything to deallocate
	 *       the main thread stack because deallocation doesn't
	 *       actually free() it; it just puts it in the free
	 *       stack queue for later reuse.
	 */
	thread->attr.stackaddr_attr = _usrstack - _thr_stack_initial;
	thread->attr.stacksize_attr = _thr_stack_initial;
	thread->attr.guardsize_attr = _thr_guard_default;
	thread->attr.flags |= THR_STACK_USER;
```

_usrstack is a global pointer to char.

size_t _thr_stack_initial = THR_STACK_INITIAL;

If I understand correctly, _usrstack is set from the kern.usrstack sysctl:

```
	if (sysctl(mib, 2, &_usrstack, &len, NULL, 0) == -1)
		PANIC("Cannot get kern.usrstack from sysctl");
```

and, depending on a couple of env vars, _thr_stack_initial comes from getrlimit(RLIMIT_STACK, &rlim).
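The arithmetic in the libthr code quoted above can be checked against the standalone output. The sketch below is only an illustration: `layout_main_stack` and its struct are hypothetical names, the `usrstack` value 0x7ffffffff000 is inferred by adding the reported stack size to the reported stack address, and the 4096-byte guard size is an assumption about `_thr_guard_default`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the two expressions used in init_main_thread. */
struct main_stack_layout {
    uint64_t stack_base;   /* usrstack - stack_size (stackaddr_attr)      */
    uint64_t guard_base;   /* usrstack - stack_size - guard_size (red zone) */
};

struct main_stack_layout layout_main_stack(uint64_t usrstack,
                                           uint64_t stack_size,
                                           uint64_t guard_size)
{
    /* The subtractions below must not wrap around. */
    assert(usrstack >= stack_size);
    assert(usrstack - stack_size >= guard_size);

    struct main_stack_layout l;
    l.stack_base = usrstack - stack_size;
    l.guard_base = l.stack_base - guard_size;
    return l;
}
```

With `usrstack = 0x7ffffffff000` and `stack_size = 536870912`, `stack_base` comes out as 0x7fffdffff000, the stack address printed by the snippet. This is why a `kern.usrstack` value that describes the host rather than the guest puts both the reported stack and the red zone in the wrong place.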

paulfloyd · 2021-03-29T06:41:48Z

Quick test on Linux - seems OK.

paulf> ./sa
a addr 0x7ffd6fab4aec
stack addr 0x7ffd6f2b7000 stack size 8380416

paulf> valgrind -q ./sa
a addr 0x1ffefffe0c
stack addr 0x1ffe801000 stack size 8384512
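The consistency check done by eye on these numbers (is the local variable inside the reported stack?) can be written as a small predicate. This is a sketch for illustration; `addr_in_stack` is a hypothetical helper, and it treats the reported stack address as the lowest address of the range, as `pthread_attr_getstack` does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if `addr` lies in
 * [stack_addr, stack_addr + stack_size), else 0. */
int addr_in_stack(uint64_t addr, uint64_t stack_addr, uint64_t stack_size)
{
    /* The range must not wrap around the address space. */
    assert(stack_addr + stack_size >= stack_addr);

    return addr >= stack_addr && addr - stack_addr < stack_size;
}
```

Fed with the values printed in this thread, the predicate holds for both Linux runs and for the standalone FreeBSD run, and fails only for FreeBSD under Valgrind, where the local is at 0x7fc00045c but the reported stack starts at 0x7fffdffff000.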

The glibc code to get the pthread attributes is a lot more complicated.

There seems to be a general case where it reads thread->stackblock. Possibly this is the case for secondary threads. Anyway, this branch isn't taken for this small example. Indeed, that is what the comment for the else block says. So for the initial thread it reads /proc/self/maps and also uses the internal variable __libc_stack_end.

There's a lot of code in VG to handle /proc/self/maps but I'm not familiar with how it works.

paulfloyd · 2021-03-29T20:42:20Z

> On 3/29/21 5:23 AM, Alan Somers wrote:
> @paulfloyd I did some work on the guard page stuff this weekend. It's not my area of expertise, but I'm making some progress in fixing stack overflow detection. Fingers crossed, that might fix Valgrind too.

I may be close to having a solution. Can you try the attached patch with my git repo?

paulfloyd · 2021-03-29T20:43:31Z

diff --git a/coregrind/m_main.c b/coregrind/m_main.c
index f3a0d1c27..56f9c6cbf 100644
--- a/coregrind/m_main.c
+++ b/coregrind/m_main.c
@@ -3843,6 +3843,13 @@ UWord voucher_mach_msg_set ( UWord arg1 )
 #endif
 
 
+Word VG_(get_usrstack)(void)
+{
+   return VG_PGROUNDDN(the_iicii.clstack_end - the_iifii.clstack_max_size);
+}
+
+
+
 /*--------------------------------------------------------------------*/
 /*--- end                                                          ---*/
 /*--------------------------------------------------------------------*/
diff --git a/coregrind/m_syswrap/syswrap-freebsd.c b/coregrind/m_syswrap/syswrap-freebsd.c
index 318721f62..b9508a4d4 100644
--- a/coregrind/m_syswrap/syswrap-freebsd.c
+++ b/coregrind/m_syswrap/syswrap-freebsd.c
@@ -1983,6 +1983,19 @@ PRE(sys___sysctl)
       }
    }
 
+   if (SARG2 >= 2 && ML_(safe_to_deref)(name, 2*sizeof(int))) {
+      if (name[0] == 1 && name[1] == 33) {
+         // kern.userstack
+         Word tmp = VG_(get_usrstack)();
+         size_t* out = (size_t*)ARG3;
+         size_t* outlen = (size_t*)ARG4;
+         *out = tmp;
+         *outlen = sizeof(size_t);
+         SET_STATUS_Success(0);
+      }
+   }
+
+
    PRE_REG_READ6(int, "__sysctl", int *, name, vki_u_int32_t, namelen, void *, oldp,
                  vki_size_t *, oldlenp, void *, newp, vki_size_t, newlen);
 
diff --git a/coregrind/pub_core_aspacemgr.h b/coregrind/pub_core_aspacemgr.h
index 0f34782d3..cf25699ca 100644
--- a/coregrind/pub_core_aspacemgr.h
+++ b/coregrind/pub_core_aspacemgr.h
@@ -384,6 +384,9 @@ extern Bool VG_(am_search_for_new_segment)(Addr *start, SizeT *size,
                                            UInt *prot);
 #endif
 
+extern Word VG_(get_usrstack)(void);
+
+
 #endif   // __PUB_CORE_ASPACEMGR_H
 
 /*--------------------------------------------------------------------*/
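For context, the query that the patch above intercepts looks roughly like this from the client side. This is a sketch, not code from libthr or Rust: `query_usrstack` is a hypothetical helper, the MIB `{CTL_KERN, KERN_USRSTACK}` is assumed to correspond to the `{1, 33}` pair tested in the patch, and on non-FreeBSD systems the function simply reports failure.

```c
#include <assert.h>
#include <stddef.h>
#ifdef __FreeBSD__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Hypothetical helper: reads kern.usrstack into *out.
 * Returns 0 on success, -1 on failure or when not built for FreeBSD. */
int query_usrstack(unsigned long *out)
{
    assert(out != NULL);
#ifdef __FreeBSD__
    /* Two-element MIB; assumed to be the {1, 33} pair the patch checks. */
    int mib[2] = { CTL_KERN, KERN_USRSTACK };
    size_t len = sizeof(*out);

    if (sysctl(mib, 2, out, &len, NULL, 0) == -1)
        return -1;
    return 0;
#else
    (void)out;                  /* no kern.usrstack outside FreeBSD */
    return -1;
#endif
}
```

Running natively, the call reaches the kernel and returns the host's value. With the patch applied, Valgrind answers the same call itself with the guest's stack address, so code that derives the stack or guard location from it sees values that are valid for the guest.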

paulfloyd · 2021-03-31T09:41:42Z

Can you confirm whether the above patch works OK? It doesn't break any of the Valgrind amd64 regression tests, and 'rg' and a hello world rust exe both seem to work.

asomers · 2021-03-31T13:00:14Z

@paulfloyd I'll probably have time to test it this weekend.

asomers · 2021-04-07T02:09:21Z

@paulfloyd Your patch works for me. HOWEVER, even without your patch, valgrind works for me on executables built with the latest Rust compiler (rustc 1.53.0-nightly (d32238532 2021-04-05)). And that's true both for the valgrind in your GH repo and the valgrind-devel in ports. I think the problem is probably fixed by #83771.

paulfloyd · 2021-04-07T07:10:54Z

I've integrated my fix to Valgrind as well.

asomers · 2021-04-07T11:48:02Z

Now that I see the full version of Paul's fix, I understand why #83771 "fixed" the problem. After #83771, rustc no longer tries to add a guard page on FreeBSD. However, it still needs to know the address of the stack for the purposes of stack overflow detection. So without Paul's fix in paulfloyd/freebsd_valgrind@5923237, stack overflow detection won't work under Valgrind.

@jonas-schievink we can close this issue now.

jonas-schievink added the labels A-runtime (Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows), O-freebsd (Operating system: FreeBSD), and C-bug (Category: This is a bug.) on Dec 8, 2019

asomers mentioned this issue Feb 3, 2021

Valgrind always crashes Rust programs on FreeBSD with "failed to allocate a guard page" paulfloyd/freebsd_valgrind#154

Closed

jonas-schievink closed this as completed Apr 7, 2021
