armv5te hello world fails to run on qemu and pi3 #46822

Closed
malbarbo opened this issue Dec 18, 2017 · 19 comments
Comments

@malbarbo
Contributor

malbarbo commented Dec 18, 2017

A hello world binary compiled with xargo and Rust nightly-2017-11-16 runs as expected, both in QEMU and on a Raspberry Pi 3. The same binary compiled with nightly-2017-11-17 segfaults.

Here is a Dockerfile that can be used to reproduce the problem:

FROM ubuntu:17.10

RUN apt-get update
RUN apt-get install \
    qemu-user \
    curl ca-certificates \
    make file \
    gcc libc6-dev \
    gcc-arm-linux-gnueabi libc6-dev-armel-cross \
    -y --no-install-recommends
# change to 2017-11-17 to fail
RUN curl https://sh.rustup.rs -sSf | sh -s -- -y --default-toolchain nightly-2017-11-16 
ENV PATH=$PATH:/root/.cargo/bin/
RUN rustup component add rust-src
RUN cargo install xargo

ENV USER=root
RUN cargo new hello --bin

RUN mkdir hello/.cargo/
RUN echo "[target.armv5te-unknown-linux-gnueabi]\nlinker = \"arm-linux-gnueabi-gcc\"" > hello/.cargo/config

RUN echo "[target.armv5te-unknown-linux-gnueabi.dependencies.std]\nfeatures = [\"force_alloc_system\"] " > hello/Xargo.toml

RUN echo "\n[profile.release]\npanic = \"abort\"" >> hello/Cargo.toml

ENV CFLAGS_armv5te_unknown_linux_gnueabi="-march=armv5te -mfloat-abi=soft" \
    CC_armv5te_unknown_linux_gnueabi=arm-linux-gnueabi-gcc

RUN cd hello && xargo build --release --target armv5te-unknown-linux-gnueabi

RUN QEMU_STRACE=1 qemu-arm -L /usr/arm-linux-gnueabi hello/target/armv5te-unknown-linux-gnueabi/release/hello

According to @Dushistov, the crash happens in __sync_val_compare_and_swap_4, but I get a different result.

Edit: removed the stack trace; it was not helping and it took up too much space.

@Dushistov
Contributor

@malbarbo

According to @Dushistov, the crash happens in __sync_val_compare_and_swap_4
And here is a strace running in raspberry pi 3:

But strace only shows system calls; it doesn't show where the crash happens. Or did I miss something?

If you have no gdb on the board (which is my case), you can run ulimit -c unlimited before starting the program,
and then open the core.SOME_PID file produced by the crash with the cross toolchain's gdb on your PC.

@malbarbo
Contributor Author

@Dushistov You are right. I didn't mean to imply that strace shows where the crash happens, sorry.

@malbarbo
Contributor Author

Running in gdb I get the following backtrace:

#0  0x00000000 in ?? ()
#1  0x0040f800 in __sync_fetch_and_add_4 ()
#2  0x004047dc in std::io::stdio::stdout::hf90edfc5dc4b8a28 ()
#3  0x00405028 in std::io::stdio::_print::h73be5c1a0e336538 ()
#4  0x00401dc0 in hello::main () at src/main.rs:2
#5  0x004087bc in __rust_maybe_catch_panic ()
#6  0x00408230 in std::rt::lang_start::h62f49e8260dc865f ()
#7  0x00401e28 in main ()

Here is the disassembly of __sync_fetch_and_add_4:

Dump of assembler code for function __sync_fetch_and_add_4:
   0x0000f7e0 <+0>:     push    {r4, r5, r6, lr}
   0x0000f7e4 <+4>:     mov     r4, r1
   0x0000f7e8 <+8>:     mov     r2, r0
   0x0000f7ec <+12>:    ldr     r5, [r2]
   0x0000f7f0 <+16>:    ldr     r6, [pc, #24]   ; 0xf810 <__sync_fetch_and_add_4+48>
   0x0000f7f4 <+20>:    add     r1, r5, r4
   0x0000f7f8 <+24>:    mov     r0, r5
   0x0000f7fc <+28>:    blx     r0
   0x0000f800 <+32>:    cmp     r0, #0
   0x0000f804 <+36>:    bne     0xf7ec <__sync_fetch_and_add_4+12>
   0x0000f808 <+40>:    mov     r0, r5
   0x0000f80c <+44>:    pop     {r4, r5, r6, pc}
   0x0000f810 <+48>:                    ; <UNDEFINED> instruction: 0xffff0fc0

Here is the disassembly for the working version:

Dump of assembler code for function __sync_fetch_and_add_4:
   0x0000fc18 <+0>:     push    {r4, r5, r6, r7, r8, lr}
   0x0000fc1c <+4>:     mov     r5, r0
   0x0000fc20 <+8>:     mov     r7, r1
   0x0000fc24 <+12>:    ldr     r6, [pc, #40]   ; 0xfc54 <__sync_fetch_and_add_4+60>
   0x0000fc28 <+16>:    ldr     r4, [r5]
   0x0000fc2c <+20>:    mov     r2, r5
   0x0000fc30 <+24>:    add     r1, r4, r7
   0x0000fc34 <+28>:    mov     r0, r4
   0x0000fc38 <+32>:    mov     lr, pc
   0x0000fc3c <+36>:    bx      r6
   0x0000fc40 <+40>:    cmp     r0, #0
   0x0000fc44 <+44>:    bne     0xfc28 <__sync_fetch_and_add_4+16>
   0x0000fc48 <+48>:    mov     r0, r4
   0x0000fc4c <+52>:    pop     {r4, r5, r6, r7, r8, lr}
   0x0000fc50 <+56>:    bx      lr
   0x0000fc54 <+60>:                    ; <UNDEFINED> instruction: 0xffff0fc0

Maybe @jamesmunns can help with this?

@jamesmunns
Member

Hey @malbarbo, a bit of quick background information:

The armv5te architecture is a little unusual for ARM in that the CPU has no native atomic instructions, which on its own would rule out Arc<_> and anything that relies on it (a good portion of std). Linux works around this with kernel-provided helper routines, mapped in the 0xffffxxxx address range, that give a slower but usable set of atomic operations. There is some background information on my PR, as well as this kernel.org document.
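
As a rough illustration of how such a helper gets used, here is a minimal sketch of the retry loop that a function like __sync_fetch_and_add_4 builds around a compare-and-exchange helper. It assumes the helper behaves like the kernel's `__kuser_cmpxchg` (the routine at 0xffff0fc0 seen in the disassembly above), which returns 0 on success. The function `kuser_cmpxchg` below is a hypothetical stand-in emulated with a host atomic so the sketch runs on any machine; it is not the code from compiler-builtins or libgcc.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical stand-in for the kernel helper at 0xffff0fc0.
/// Like the real helper, it returns 0 when `*ptr` equalled `old` and was
/// replaced by `new`, and non-zero otherwise. It is emulated with a host
/// atomic here because the real helper only exists on ARM Linux.
fn kuser_cmpxchg(old: u32, new: u32, ptr: &AtomicU32) -> u32 {
    match ptr.compare_exchange(old, new, Ordering::SeqCst, Ordering::SeqCst) {
        Ok(_) => 0,
        Err(_) => 1,
    }
}

/// Retry loop in the shape of `__sync_fetch_and_add_4`: load the current
/// value, compute the sum, attempt the swap, and start over if the value
/// changed in the meantime.
fn fetch_and_add(ptr: &AtomicU32, val: u32) -> u32 {
    loop {
        let old = ptr.load(Ordering::SeqCst);
        let new = old.wrapping_add(val);
        if kuser_cmpxchg(old, new, ptr) == 0 {
            return old; // the builtin returns the value before the addition
        }
    }
}

fn main() {
    let counter = AtomicU32::new(41);
    let previous = fetch_and_add(&counter, 1);
    println!("previous = {}, now = {}", previous, counter.load(Ordering::SeqCst));
}
```

The load / add / call / compare-with-zero / branch-back sequence in both disassembly listings follows this same shape.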

I unfortunately am not aware whether non-armv5te kernels contain this "feature", but if not, these crashes would make sense to me, since the kernel is not stepping in to catch these shims.

I also do not know what the issue is here between the different versions of Rust.

I am currently running your dockerfile with the 2017-11-17 version of Rust, and I will extract the binary and run on some actual armv5te hardware, and see if I can repro for you.

I don't think this helps much, but I'll re-read this later when I have more time, and I am happy to explain anything I can. Feel free to re-ping me or ask questions.

@parched
Contributor

parched commented Dec 19, 2017

Looks like the problem is the previous code is from libgcc but the new code is from rust-lang/compiler-builtins#115.

cc @Amanieu

@jamesmunns
Member

I can confirm the -17 version also fails on real armv5te hardware:

# uname -a
Linux 4.4.24 #1 Fri Nov 24 13:37:38 UTC 2017 armv5tejl GNU/Linux
# ls -hal ./hello
-rwxr-xr-x    1 root     root      129.2K Dec 19 18:02 ./hello
# ./hello
Illegal instruction
#

My ARM assembly is a bit weak, so I may not be able to help @Amanieu, but I am happy to test any potential changes on my side (and with my hardware).

@malbarbo
Contributor Author

Thanks @jamesmunns and @parched for taking a look at this.

@Amanieu
Member

Amanieu commented Dec 19, 2017

@jamesmunns Could you try running the program in gdb and disassembling the function in which the crash occurs?

(gdb) run
(gdb) backtrace
(gdb) disas

@parched
Contributor

parched commented Dec 19, 2017

0x0000f7fc <+28>:    blx     r0

This can't be right, can it? r0 is one of the arguments to that call? I wonder if LLVM is mucking up the register allocation for the inline assembly.
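
To make the observation concrete, here is a small register-level model of the instructions that precede the call in the failing disassembly. It is only a sketch: the function name `blx_target`, the placeholder pointer value, and the assumption that the counter held 0 (which would be consistent with frame #0 at 0x00000000 in the backtrace) are all illustrative.

```rust
/// Models r0..r6 after the instructions at +4..+24 of the failing
/// `__sync_fetch_and_add_4` listing, and returns the address that
/// `blx r0` would branch to together with the register file.
fn blx_target(counter: u32, addend: u32) -> (u32, [u32; 7]) {
    const KUSER_CMPXCHG: u32 = 0xffff_0fc0; // literal stored at +48 in the dump
    let mut r = [0u32; 7];
    r[4] = addend;                  // mov r4, r1   (second argument)
    r[2] = 0x1000;                  // mov r2, r0   (pointer; placeholder address)
    r[5] = counter;                 // ldr r5, [r2] (current value of the counter)
    r[6] = KUSER_CMPXCHG;           // ldr r6, [pc, #24]
    r[1] = r[5].wrapping_add(r[4]); // add r1, r5, r4
    r[0] = r[5];                    // mov r0, r5
    (r[0], r)                       // blx r0 branches to whatever r0 holds
}

fn main() {
    let (target, regs) = blx_target(0, 1);
    println!("blx r0 jumps to {:#010x}", target);
    println!("helper address sits unused in r6: {:#010x}", regs[6]);
}
```

In this model the helper address is loaded into r6 but never used as the branch target, while r0 holds the old counter value, so the branch goes to the counter's value instead of the helper.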

@Amanieu
Member

Amanieu commented Dec 20, 2017

@parched The blx instruction should be supported on all ARMv5 systems. Are you sure your hardware isn't ARMv4 by any chance? You are getting the same Illegal instruction crash as @jamesmunns, right?

@Amanieu
Member

Amanieu commented Dec 20, 2017

@parched Actually, you are right, there does seem to be something wrong with the register allocation. I'll look into it.

@Amanieu
Member

Amanieu commented Dec 20, 2017

See rust-lang/compiler-builtins#218

@jamesmunns
Member

Hey @Amanieu, I don't currently have a build for my device with gdb enabled (I'm not actively working on that device at the moment). Do you still need this, or should I wait for the next nightly to come through to test the changes introduced by rust-lang/compiler-builtins#218?

Let me know; if you still need it, I will set up a new build and image for my device with gdb, etc.

@Amanieu
Member

Amanieu commented Dec 20, 2017

@jamesmunns You should wait for the next nightly. The previous code was definitely broken.

@malbarbo
Contributor Author

@Amanieu Thanks for working on this. I made a PR updating the compiler_builtins crate.

kennytm added a commit to kennytm/rust that referenced this issue Dec 21, 2017
@green-s
Contributor

green-s commented Dec 22, 2017

Should the latest nightly have fixed this (250b492 2017-12-21)? I'm still getting a segfault.

@arielb1
Contributor

arielb1 commented Dec 22, 2017

@green-s
Contributor

green-s commented Dec 23, 2017

Tested the new nightly (5165ee9) in QEMU and on-device. It now successfully prints hello world but segfaults afterwards.

(gdb) run
Starting program: /home/sam/hello-world
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabi/libthread_db.so.1".
Hello, world!

Program received signal SIGSEGV, Segmentation fault.
0x00410644 in __sync_val_compare_and_swap_4 ()
(gdb) backtrace
#0  0x00410644 in __sync_val_compare_and_swap_4 ()
#1  0x00407ec4 in std::rt::lang_start::hc79ba98377dc1008 ()
#2  0xb6e1c2cc in __libc_start_main (main=0xbefffa74, argc=-1225486336, argv=0xb6e1c2cc <__libc_start_main+280>, init=<optimized out>,
    fini=0x410b48 <__libc_csu_fini>, rtld_fini=0xb6fdfc60 <_dl_fini>, stack_end=0xbefffa74) at libc-start.c:287
#3  0x00401d1c in _start ()
(gdb) disas
Dump of assembler code for function __sync_val_compare_and_swap_4:
   0x00410634 <+0>:     push    {r4, r5, r6, lr}
   0x00410638 <+4>:     mov     r4, r2
   0x0041063c <+8>:     mov     r6, r1
   0x00410640 <+12>:    mov     r5, r0
=> 0x00410644 <+16>:    ldr     r0, [r4]
   0x00410648 <+20>:    cmp     r0, r5
   0x0041064c <+24>:    popne   {r4, r5, r6, pc}
   0x00410650 <+28>:    ldr     r3, [pc, #28]   ; 0x410674 <__sync_val_compare_and_swap_4+64>
   0x00410654 <+32>:    mov     r0, r5
   0x00410658 <+36>:    mov     r1, r6
   0x0041065c <+40>:    mov     r2, r4
   0x00410660 <+44>:    blx     r3
   0x00410664 <+48>:    cmp     r0, #0
   0x00410668 <+52>:    bne     0x410644 <__sync_val_compare_and_swap_4+16>
   0x0041066c <+56>:    mov     r0, r5
   0x00410670 <+60>:    pop     {r4, r5, r6, pc}
   0x00410674 <+64>:                    ; <UNDEFINED> instruction: 0xffff0fc0
End of assembler dump.
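
One possible reading of this dump, offered as an inference rather than something confirmed in the thread: the faulting instruction is `ldr r0, [r4]`, and r4 was copied from r2, the third argument. GCC documents the builtin as `__sync_val_compare_and_swap(ptr, oldval, newval)`, with the pointer first, so the function may be dereferencing the new value instead of the pointer. The sketch below shows the operation with the pointer-first argument order. The helper `kuser_cmpxchg` is a hypothetical host-side emulation of the kernel routine at 0xffff0fc0, which takes its arguments in the order (old, new, ptr).

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical emulation of the kernel helper: arguments are
/// (old, new, ptr), and the return value is 0 if the swap happened.
fn kuser_cmpxchg(old: u32, new: u32, ptr: &AtomicU32) -> u32 {
    match ptr.compare_exchange(old, new, Ordering::SeqCst, Ordering::SeqCst) {
        Ok(_) => 0,
        Err(_) => 1,
    }
}

/// Compare-and-swap with GCC's argument order: pointer first, then the
/// expected value, then the replacement. Returns the value that was in
/// `*ptr` before the call.
fn val_compare_and_swap(ptr: &AtomicU32, old: u32, new: u32) -> u32 {
    loop {
        let current = ptr.load(Ordering::SeqCst);
        if current != old {
            return current; // comparison failed, nothing is written
        }
        if kuser_cmpxchg(old, new, ptr) == 0 {
            return old; // swap succeeded
        }
        // The value changed between the load and the helper call; retry.
    }
}

fn main() {
    let cell = AtomicU32::new(5);
    let seen = val_compare_and_swap(&cell, 5, 9);
    println!("saw {}, cell is now {}", seen, cell.load(Ordering::SeqCst));
}
```

The point of the sketch is the mismatch between the two calling conventions: the builtin receives (ptr, old, new) while the kernel helper expects (old, new, ptr), so a wrapper has to reorder the arguments before calling the helper.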

@Amanieu
Member

Amanieu commented Dec 23, 2017

bors added a commit to rust-lang/compiler-builtins that referenced this issue Dec 23, 2017
bors added a commit that referenced this issue Dec 25, 2017
bors added a commit that referenced this issue Dec 27, 2017
Add dist builder for armv5te-unknown-linux-gnueabi (again)

The dist builder was first added in #46498 and later removed in #46498 because of #46822.

#46822 seems to be fixed now (@green-s and I have [tested](#46498 (comment)) it).
7 participants