This repository was archived by the owner on Jan 30, 2024. It is now read-only.

stack trace infinite loop #46

Closed
kaspar030 opened this issue Sep 1, 2020 · 7 comments · Fixed by #50
Labels
type: bug Something isn't working

Comments

@kaspar030

I'm just getting started using probe-run, testing a threading crate I'm building.

So (on Cortex-M), the thread stacks are populated with a frame like the one exception entry pushes: [r0-r3, r12, LR, PC, xPSR].

LR is set to the location of a cleanup(), which does some house-keeping, then triggers the scheduler to switch away, then does a loop {}, which is practically unreachable.
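As a rough illustration of the setup described above, here is a minimal host-side sketch of such an initial frame. The hardware's basic exception frame is r0-r3, r12, LR, PC, xPSR; the names `ExceptionFrame` and `initial_frame` are hypothetical and are not taken from riot_core, and the addresses are only sample values.

```rust
/// Basic Cortex-M exception frame (no FPU state), lowest address first.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ExceptionFrame {
    r0: u32,
    r1: u32,
    r2: u32,
    r3: u32,
    r12: u32,
    lr: u32,
    pc: u32,
    xpsr: u32,
}

/// T (Thumb state) bit in xPSR; it must be set on Cortex-M.
const XPSR_THUMB: u32 = 1 << 24;

/// Hypothetical helper: build the frame a new thread starts from.
/// `entry` is the thread function, `cleanup` is where it returns to.
fn initial_frame(entry: u32, arg: u32, cleanup: u32) -> ExceptionFrame {
    ExceptionFrame {
        r0: arg, // first argument of the thread function
        r1: 0,
        r2: 0,
        r3: 0,
        r12: 0,
        lr: cleanup | 1, // return address carries the Thumb bit
        pc: entry & !1,  // stacked PC is halfword aligned
        xpsr: XPSR_THUMB,
    }
}

fn main() {
    // Sample addresses, chosen only for illustration.
    let frame = initial_frame(0x0000_0616, 0, 0x0000_0aa4);
    println!("lr = {:#010x}, pc = {:#010x}", frame.lr, frame.pc);
    println!("frame size = {} bytes", std::mem::size_of::<ExceptionFrame>());
}
```

Note that the LR value in such a frame has bit 0 set while the address of the function itself does not, which matters for anything that compares LR against PC.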

Now when I run something with probe-run, the resulting stack trace endlessly repeats in cleanup():

[...]
    Finished dev [optimized + debuginfo] target(s) in 0.01s              
     Running `probe-run --chip nRF52840_xxAA /home/kaspar/src/own/rust/riot_core/target/thumbv7em-none-eabi/debug/examples/bench_lock`             
flashing program ..                                                      
^[[CDONE                                                                 
resetting device                                                         
stack backtrace:                                                         
   0: 0x00001752 - <unknown>                                             
   1: 0x000016b6 - cortex_m_semihosting::hio::hstdout
   2: 0x00001650 - cortex_m_semihosting::export::hstdout_fmt::{{closure}}                                                                          
   3: 0x00001558 - cortex_m::interrupt::free                             
   4: 0x00001638 - cortex_m_semihosting::export::hstdout_fmt
   5: 0x000007b6 - user_main                                             
   6: 0x0000112a - riot_core::main_trampoline                            
   7: 0x00000d20 - riot_core::thread::cleanup                                                                                                      
   8: 0x00000d20 - riot_core::thread::cleanup                            
   9: 0x00000d20 - riot_core::thread::cleanup       
  10: 0x00000d20 - riot_core::thread::cleanup                            
  11: 0x00000d20 - riot_core::thread::cleanup            
  12: 0x00000d20 - riot_core::thread::cleanup
  13: 0x00000d20 - riot_core::thread::cleanup
[...]

The lines <N>: 0x00000d20 - riot_core::thread::cleanup repeat extremely fast, and the process cannot be cancelled with CTRL-C.

While I might have messed up something on the stack to make it invalid and look like this to probe-run, I'd expect it to stop after maybe the second iteration, letting the user know that this would repeat.

(I'm assuming that probe-run's RTT support uses a different BKPT scheme than traditional semihosting, and that's why this exits on the first hprintln!(), but that is unrelated to this issue.)

This is on probe-run 0.1.3 installed today via "cargo install probe-run".

jonas-schievink added the "type: bug" label Sep 1, 2020
@jonas-schievink
Contributor

Thanks for the report! #47 should fix the behavior of Ctrl+C while the backtrace is being generated. To investigate the broken backtrace, it would be great if you could provide the source code of riot_core.

@japaric
Member

japaric commented Sep 1, 2020

7: 0x00000d20 - riot_core::thread::cleanup
8: 0x00000d20 - riot_core::thread::cleanup

in these situations we should check both the PC (printed) and SP (not printed); if both haven't changed in one iteration then the virtual unwinder has gotten stuck and we should stop it and print an error message. (I thought there was already some logic to check for this condition but maybe not)
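A minimal sketch of that idea, assuming a simplified unwinder that yields one (PC, SP) pair per step. The `Frame`, `Outcome` and `unwind` names are hypothetical and this is not probe-run's actual code; the simulated `step` closure stands in for reading unwind info from the target.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Frame {
    pc: u32,
    sp: u32,
}

#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    /// The step function reported the end of the stack.
    Finished,
    /// PC and SP were both unchanged after one step.
    Stuck,
    /// The frame limit was reached first.
    Truncated,
}

/// Collect frames until the stack ends, the unwinder stops making
/// progress, or `max_frames` is reached.
fn unwind(
    start: Frame,
    mut step: impl FnMut(Frame) -> Option<Frame>,
    max_frames: usize,
) -> (Vec<Frame>, Outcome) {
    let mut frames = vec![start];
    let mut current = start;
    while frames.len() < max_frames {
        match step(current) {
            None => return (frames, Outcome::Finished),
            // Neither PC nor SP moved: report it instead of looping forever.
            Some(next) if next == current => return (frames, Outcome::Stuck),
            Some(next) => {
                frames.push(next);
                current = next;
            }
        }
    }
    (frames, Outcome::Truncated)
}

fn main() {
    // Simulated unwinder: one real step, then the same frame forever.
    let step = |f: Frame| match f.pc {
        0x0616 => Some(Frame { pc: 0x0aa4, sp: 0x2000_03fc }),
        0x0aa4 => Some(Frame { pc: 0x0aa4, sp: 0x2000_03fc }),
        _ => None,
    };
    let (frames, outcome) = unwind(Frame { pc: 0x0616, sp: 0x2000_03f4 }, step, 64);
    println!("{} frames, outcome: {:?}", frames.len(), outcome);
}
```

The frame limit is an extra safety net in this sketch: it also bounds cycles longer than one step, which the equality check alone would not catch.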

@jonas-schievink
Contributor

We do have such a check, but it seems like that isn't enough:

probe-run/src/main.rs

Lines 579 to 582 in e1964b8

if !cfa_changed && lr == pc {
    println!("error: the stack appears to be corrupted beyond this point");
    return Ok(top_exception);
}

@jonas-schievink
Contributor

@kaspar030 It would be great if you could re-run this on the current main branch (which includes #48), with RUST_LOG=probe_run=debug set. That should print some more information we can use to diagnose what's happening.

@kaspar030
Author


Done, here's the end of the output:

stack backtrace:                                                                                                                   
   0: 0x0000139e - <unknown>                                                                                                       
[2020-09-01T20:49:41Z DEBUG probe_run] update_cfa: CFA changed Some(2000039c) -> 200003bc                                          
[2020-09-01T20:49:41Z DEBUG probe_run] lr=0x00001303 pc=0x0000139e                                                                 
   1: 0x00001302 - cortex_m_semihosting::hio::hstdout                                                                              
[2020-09-01T20:49:41Z DEBUG probe_run] update_cfa: CFA changed Some(200003bc) -> 200003c4                                          
[2020-09-01T20:49:41Z DEBUG probe_run] lr=0x00001245 pc=0x00001302                                                                 
   2: 0x00001244 - cortex_m_semihosting::export::hstdout_str::{{closure}}                                                          
[2020-09-01T20:49:41Z DEBUG probe_run] update_cfa: CFA changed Some(200003c4) -> 200003d4                                          
[2020-09-01T20:49:41Z DEBUG probe_run] lr=0x00001177 pc=0x00001244                                                                 
   3: 0x00001176 - cortex_m::interrupt::free                                                                                       
[2020-09-01T20:49:41Z DEBUG probe_run] update_cfa: CFA changed Some(200003d4) -> 200003e4                                          
[2020-09-01T20:49:41Z DEBUG probe_run] lr=0x00001231 pc=0x00001176                                                                 
   4: 0x00001230 - cortex_m_semihosting::export::hstdout_str                                                                       
[2020-09-01T20:49:41Z DEBUG probe_run] update_cfa: CFA changed Some(200003e4) -> 200003f4                                          
[2020-09-01T20:49:41Z DEBUG probe_run] lr=0x00000617 pc=0x00001230                                                                 
   5: 0x00000616 - app::func                                                                                                       
[2020-09-01T20:49:41Z DEBUG probe_run] update_cfa: CFA changed Some(200003f4) -> 200003fc                                          
[2020-09-01T20:49:41Z DEBUG probe_run] lr=0x00000aa5 pc=0x00000616                                                                 
   6: 0x00000aa4 - riot_core::thread::cleanup
[2020-09-01T20:49:41Z DEBUG probe_run] lr=0x00000aa5 pc=0x00000aa4
   7: 0x00000aa4 - riot_core::thread::cleanup

I guess the Thumb bit (bit 0 of LR) needs to be ignored in the check that's mentioned above: lr=0x00000aa5 never equals pc=0x00000aa4, so the check doesn't fire.
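To make the mismatch concrete, here is a small sketch using the lr/pc values from the debug output above. The helper names `same_address` and `is_stuck` are hypothetical, and this is only one way to write such a comparison, not necessarily what the eventual fix does.

```rust
/// Bit 0 of a Thumb return address marks the instruction set state;
/// it is not part of the instruction address itself.
const THUMB_BIT: u32 = 1;

/// Compare two code addresses while ignoring the Thumb bit.
fn same_address(lr: u32, pc: u32) -> bool {
    (lr & !THUMB_BIT) == (pc & !THUMB_BIT)
}

/// Hypothetical variant of the check: the unwinder is stuck when the CFA
/// did not change and LR points back at the current PC.
fn is_stuck(cfa_changed: bool, lr: u32, pc: u32) -> bool {
    !cfa_changed && same_address(lr, pc)
}

fn main() {
    // Values from the last DEBUG line of the log.
    let (lr, pc): (u32, u32) = (0x0000_0aa5, 0x0000_0aa4);
    println!("plain comparison:  {}", lr == pc);
    println!("masked comparison: {}", same_address(lr, pc));
    println!("stuck: {}", is_stuck(false, lr, pc));
}
```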

@jonas-schievink
Contributor

Yeah that sounds like the right fix. Can you try with #50 to see if that fixes it?

@kaspar030
Author

Thanks a bunch!
