assertion failed: page && pm_slot_check_refcnt(*page->pg_tree_slot) #42
Hi -
On 2017-10-16 at 10:09 Dmitry Vyukov wrote:
I am getting the following crashes. Is it a known issue? If not and you don't see why it happens right away, I can try to create a reproducer.
I don't recognize this one. It looks like there are a couple issues.
I'd need to see a little of the ASM for the GP faults to know why
userspace is faulting.
The kernel panic is also interesting. My guess is a refcounting
problem. I might be able to figure it out from a backtrace ('bt' from
the monitor) and the value of *page->pg_tree_slot. But a reproducer
might be needed.
|
Here is the backtrace:
There is no source/line info in obj/kern/akaros-kernel, so I can't map this to lines. |
I failed to create a C reproducer. If I am reading this correctly, sys_exec is the exec system call. The fuzzer itself does not call exec, so I wonder what calls it. This probably explains why I can't create a standalone repro. Can you see from the crash message which process caused the panic?
|
This can be reproduced by running the whole fuzzer, though. Repro steps:
This crashes kernel in <10 seconds for me. |
On 2017-10-16 at 18:09 Dmitry Vyukov ***@***.***> wrote:
I failed to create a C reproducer. If I am reading this correctly, sys_exec is the exec system call. The fuzzer itself does not call exec, so I wonder what calls it. This probably explains why I can't create a standalone repro. Can you see from the crash message which process caused the panic?
It looks like sh, since it was the process running at the time. Do you
have a bash script of some sort running to drive syz-executor?
As far as the backtrace goes, you can use:
addr2line -e obj/kern/akaros-kernel-64
Though in this case, it won't help much - I can see the codepath
regardless of line numbers. It looks like we're just failing to read a
file in generic_file_read().
|
On 2017-10-16 at 18:17 Dmitry Vyukov ***@***.***> wrote:
This can be reproduced by running the whole fuzzer, though.
Thanks, I'll take a look.
|
What's strange is that all crashes mention RIP=0x0000000000400fff, but that address does not point to any instruction in the binary. It's not in my code: it falls inside the init_cacheinfo function, which is called somewhere from pthread. It's also the last byte of the first page of the text section (which is paged in for the first time?). Maybe that rings a bell for you.
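For reference, the "last byte of the page" observation is just address arithmetic. A minimal sketch, assuming 4 KiB pages (the usual base page size on x86_64); the helper names `page_base` and `page_offset` are made up for illustration and are not Akaros functions:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB base pages. */
#define PAGE_SIZE_BYTES 4096ULL

/* Start address of the page that contains addr. */
static inline uint64_t page_base(uint64_t addr)
{
	return addr & ~(PAGE_SIZE_BYTES - 1);
}

/* Byte offset of addr within its page. */
static inline uint64_t page_offset(uint64_t addr)
{
	return addr & (PAGE_SIZE_BYTES - 1);
}
```

With these, `page_base(0x400fff)` is `0x400000` and `page_offset(0x400fff)` is `0xfff`, i.e. `PAGE_SIZE_BYTES - 1`, the final byte of that page.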
|
Hi - I was able to recreate the bug. I haven't solved it yet, but I have a bunch of leads. One minor thing: I had to change the CC variable in your Makefile for the executor to build it in my setup. This way should work for everyone:
$AKAROS_XCC_ROOT is an environment variable that everyone should use to point to the toolchain installation. I didn't need the ADDCFLAGS either, though maybe that's a peculiarity of my setup. The only other thing I do is put $(AKAROS_XCC_ROOT)/bin/ in my $PATH, though I don't see why that would help. Anyway, thanks for the bug - I'll post more when I solve it. |
Humm... If I remove ADDCFLAGS,
I've bootstrapped the toolchain using these commands:
How is your setup different? |
Re SOURCEDIR vs AKAROS_XCC_ROOT: |
We have two standard env variables. AKAROS_ROOT is the git repo you've downloaded. AKAROS_XCC_ROOT is the location of the toolchain installation - basically everything that gcc/binutils/glibc creates, plus our kernel headers and user libraries. SOURCEDIR sounds like AKAROS_ROOT, though things like kernel headers are also available in AKAROS_XCC_ROOT. So far, we mostly use AKAROS_XCC_ROOT to find the cross compiler. AKAROS_ROOT is often used for installing cross-compiled binaries into the kernel file system (e.g. $AKAROS_ROOT/kern/kfs/bin, which will end up in /bin). It seems pretty odd that your cross compiler doesn't know where to look for header files - or at least some of them? Did you move the toolchain after building it? (That seems unlikely.) You should have set the env variable X86_64_INSTDIR during toolchain installation too, usually in tools/compilers/gcc-glibc/Makelocal. If you skipped that, the build should have given you an error. But if not, that could mess things up too. (SYSROOT and install locations derive from that). One option would be to |
It looks at multiple locations under I guess I need to try to rebootstrap everything from scratch before we spend more time on this. Don't you have AKAROS_XCC_ROOT point to "$AKAROS_ROOT/install/x86_64-ucb-akaros-gcc"? If yes, then the current Makefile should work as well (provided that you run it as |
That sounds right - you should have those files in the toolchain. e.g. If you look in I don't have my AKAROS_XCC_ROOT pointing to a location inside my AKAROS_ROOT. I have it set up like this:
Actually, since I have the bin directory of XCC_ROOT in my PATH, I can run
|
btw, I just tracked down the bug(s). The main one was that under rare conditions (races with page faults on the syz-executor binary), the mm code would free a page that was in the page cache. The page would eventually get reused, which is why syz-executor would go crazy - a chunk of .text was garbage. When the page was reused, various refcnts/flags would be wrong too, which was ultimately responsible for the panic you found. (Short version: it was improperly decreffed, then when we increffed it we had a refcnt of 0, which was the panic). Anyway, I'll have a patch out later today. With it, the stress tester ran without crashing Akaros. |
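To make the short version concrete, here is a toy model of the refcount sequence. This is only a sketch under assumptions: `struct toy_page`, `toy_page_incref` and `toy_page_decref` are hypothetical names, and the real Akaros page cache code and the `pm_slot_check_refcnt` check are more involved than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a page; not Akaros's struct page. */
struct toy_page {
	int refcnt;          /* number of holders of this page */
	bool in_page_cache;  /* true while the page cache still points at it */
};

/* Taking a new reference is only valid if someone already holds one.
 * Returns false in the case where a kernel assertion would fire. */
static bool toy_page_incref(struct toy_page *pg)
{
	if (pg->refcnt <= 0)
		return false;
	pg->refcnt++;
	return true;
}

/* Drops one reference. Returns true when the count reaches zero,
 * meaning the page would be handed back to the allocator. */
static bool toy_page_decref(struct toy_page *pg)
{
	assert(pg->refcnt > 0);
	pg->refcnt--;
	return pg->refcnt == 0;
}
```

Start with `refcnt = 1` held by the page cache. A mapping increfs to 2 and later decrefs back to 1. One extra, improper decref then brings the count to 0 and the page is "freed" even though `in_page_cache` is still true. The next `toy_page_incref` sees a count of 0 and fails, which corresponds loosely to the assertion in the issue title.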
I am getting the following crashes. Is it a known issue? If not and you don't see why it happens right away, I can try to create a reproducer. Checkout is on 6344ed0.