cpu/cortexm: add __NOP(); after __WFI(); for stm32l152 to avoid hardfault #8518
Minor typo found
* 1: external crystal available (always 32.768kHz)
*
* LSE might not be available by default in early (C-01) Nucleo boards.
* If you're sure it is present, define CLOCL_LSE=1 in your project
CLOCK_LSE
Oops, addressed.
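A minimal sketch of how a project-level CLOCK_LSE setting could be consumed. Only the CLOCK_LSE name and the meaning "1: external crystal available" come from the diff above; the default of 0, the enum and lfclk_source() are hypothetical and are not taken from the RIOT sources.

```c
/* CLOCK_LSE is the option documented in the diff above
 * (1: external 32.768kHz crystal available).
 * ASSUMPTION: a default of 0 when the project does not define it. */
#ifndef CLOCK_LSE
#define CLOCK_LSE (0)
#endif

/* Hypothetical names, for illustration only */
typedef enum {
    LFCLK_LSI,  /* internal low-speed oscillator */
    LFCLK_LSE   /* external low-speed crystal */
} lfclk_source_t;

/* Pick the low-speed clock source at compile time */
lfclk_source_t lfclk_source(void)
{
#if CLOCK_LSE
    return LFCLK_LSE;
#else
    return LFCLK_LSI;
#endif
}
```

With this shape, a board without the crystal builds unchanged, and a project that is sure the crystal is present would pass CLOCK_LSE=1 through its build flags.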
I just tested the shell with
Which boards are you testing?
Sorry, that was not clear: I have a nucleo-l152 rev C-03.
Can you try to debug and see whether it gets stuck here:
0x08001530 in stmclk_enable_lfclk () at /Users/facosta/git/RIOT-OS/RIOT/cpu/stm32_common/stmclk_common.c:79
79 while (!(RCC->REG_LSE & BIT_LSERDY)) {}
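If the board has no LSE crystal, the loop at line 79 would never see the ready bit and would spin forever. A bounded poll is one way to make that visible while debugging. The sketch below assumes nothing about RIOT beyond the register and bit names quoted above; wait_for_flag() and the poll budget are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Poll *reg until any bit in mask is set, giving up after max_polls reads.
 * Returns true if the flag was seen, false on timeout.
 * Hypothetical helper, not part of RIOT. */
bool wait_for_flag(const volatile uint32_t *reg, uint32_t mask,
                   uint32_t max_polls)
{
    while (max_polls--) {
        if (*reg & mask) {
            return true;
        }
    }
    return false;
}
```

On target, a call such as wait_for_flag(&RCC->REG_LSE, BIT_LSERDY, 100000) returning false would point at a missing or non-starting crystal rather than at the sleep code.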
Please split the PRs. They're dealing with unrelated issues...
cpu/cortexm_common/include/cpu.h
Outdated
@@ -104,7 +104,8 @@ static inline void cortexm_sleep(int deep)
}

/* ensure that all memory accesses have completed and trigger sleeping */
unsigned state = irq_disable();
/* avoid state to be stored in r0 (causes fault in some platforms) */
volatile unsigned state = irq_disable();
maybe make this optional?
I see no problem with that; however, I cannot see why this would be harmful either, and the difference in size is only 4 bytes.
well, it is also an extra memory access (vs. register) on every sleep. Don't know if that matters.
Any other opinions?
It just makes the variable be stored on the stack, so it's less than a function call.
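The trade-off discussed in this thread can be sketched on a host machine. The stubs below are stand-ins (assumptions) for RIOT's irq_disable()/irq_restore() and for the __DSB(); __WFI(); pair; only the volatile declaration mirrors the diff under review.

```c
/* Host-side stand-ins so the sketch compiles anywhere.
 * ASSUMPTION: 1 means "interrupts enabled" in this fake state word. */
static unsigned fake_irq_state = 1;

static unsigned irq_disable(void)
{
    unsigned old = fake_irq_state;
    fake_irq_state = 0;
    return old;
}

static void irq_restore(unsigned state)
{
    fake_irq_state = state;
}

static void wait_for_interrupt(void)
{
    /* on target: __DSB(); __WFI(); */
}

void sleep_with_stacked_state(void)
{
    /* volatile makes the compiler keep `state` in memory (in practice, on
     * the stack) and reload it after the wait, instead of carrying it in a
     * register such as r0 across the sleep */
    volatile unsigned state = irq_disable();
    wait_for_interrupt();
    irq_restore(state);
}
```

The cost is one store and one load per sleep, which is the extra memory access mentioned above.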
0f14567 to f8fa21e
Done.
I would prefer if we could find out why r0 is lost on resume from sleep. Is there a hardware issue with these CPUs or is there a bug in the implementation of one of the ISRs in the periph drivers?
Definitely... But seeing the time that has already been put into debugging this, IMO it is fine to go with a workaround for now. Still, it should be documented as such (and not as a necessary solution to a problem) and only be enabled for affected platforms.
As far as I can tell, any interrupt would cause the hardfault (e.g. the hello-world example "works", but if you type something in the terminal it hardfaults), thus I'd discard the possibility of a faulty periph implementation. I'd also like to get #8402 in since it also helps in this situation; I actually thought it would solve the original problem, but unfortunately it does not. The same goes for #8403. I can debug more to find the "real" source, if there is one, but for now I'd like to have the platform working again before doing "major" reworking of clock and pm.
I suspect it has something to do with the power modes, which may need to be configured before going to sleep. In this situation it "wildly" goes to sleep, so I'd expect undefined behaviour of registers and peripherals; here we may just be observing a part of it. Hence the importance of #8403 and #8402.
I also insist on investigating #8024 (comment)
Does it also solve the hard fault?
No, that's why I came to this fix.
What exactly does this mean? Which situation, why "wildly"?
What I mean is that currently we put all cortexes to sleep regardless of the implementation of I'll come up with a more formal explanation of why r0 is lost in this case (IMHO due to the lack of configuration before sleep) later, in another issue or in this thread if it has not been merged by then. For that I need to succeed in configuring clock/pm as needed and experiment with it.
Some findings (thanks @kaspar030) suggest that this MCU in particular behaves "wrong" after idling or sleeping. A simple I'll change this PR to reflect the new solution which seems much more intrusive than the current one.
;)
@kYc0o Where exactly do you add the nop?
Just after
I would still like to know if it can be broken with the Could it be that one of the instructions should be aligned on a 4-byte address and is not for this platform because the compiler does crap? Is there the same alignment requirement for instructions as for memory accesses?
I did some tests and it didn't crash with several NOPs. Thus, I don't see the same behaviour as in the issue on the ST webpage.
Is the r0 corruption completely random or is it always the same?
There was no other PR, unless you mean my first attempt, which I don't consider good. I'll open a second PR after this gets merged. However, that wouldn't really be the solution for your problem, since it should work with the external crystal anyway. PS. I changed the description.
Indeed, but at least this would allow people to use this board, which is not possible at the moment.
strikethrough text is not enough I think, please change
Ok @aabadie it seems you changed to refs, so you ACK?
so you ACK?
ACK :)
Ok so there's only @kaspar030's ACK left.
@kaspar030 do you ACK this one?
cpu/cortexm_common/include/cpu.h
Outdated
@@ -107,6 +107,14 @@ static inline void cortexm_sleep(int deep)
unsigned state = irq_disable();
__DSB();
__WFI();
/*
Let's move the comment into the ifdef, and make it a little shorter. E.g.:
/* STM32L152RE crashes without this __NOP(). See #8518. */
Addressed.
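A sketch of the shape this review thread converges on: the comment inside the ifdef, and the __NOP() placed right after __WFI(). The guard macro name CPU_MODEL_STM32L152RE is an assumption, and the intrinsics and irq functions are replaced by logging stubs so that the call order can be checked on a host.

```c
#include <string.h>

/* Host-side stand-ins that record the call order in `trace`.
 * On target these are the CMSIS intrinsics and RIOT's irq API. */
static char trace[16];

static void log_op(char c)
{
    size_t n = strlen(trace);
    if (n + 1 < sizeof(trace)) {
        trace[n] = c;
        trace[n + 1] = '\0';
    }
}

#define __DSB() log_op('D')
#define __WFI() log_op('W')
#define __NOP() log_op('N')

static unsigned irq_disable(void) { log_op('d'); return 1; }
static void irq_restore(unsigned state) { (void)state; log_op('r'); }

/* ASSUMPTION: guard name; normally set by the build system for the
 * affected CPU. Defined here so the sketch exercises the workaround. */
#define CPU_MODEL_STM32L152RE

static inline void cortexm_sleep_sketch(void)
{
    unsigned state = irq_disable();
    __DSB();
    __WFI();
#ifdef CPU_MODEL_STM32L152RE
    /* STM32L152RE crashes without this __NOP(). See #8518. */
    __NOP();
#endif
    irq_restore(state);
}
```

Keeping the workaround behind the guard means other Cortex-M platforms execute exactly the same sequence as before.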
ff54170 to 933f281
ACK.
@kYc0o, please squash
933f281 to ac93283
Squashed.
- The __NOP() that was added in RIOT-OS#8518 is now removed.
- When DBG_STANDBY, DBG_STOP or DBG_SLEEP are set in DBG_CR, a hardfault occurs on wakeup from sleep. This was first diagnosed in RIOT-OS#8518. When enabled, a hardfault occurred when returning from a branch to irq_restore(); we avoid the call by inlining the function. See #xxxxx for more details.
- The __NOP() that was added in RIOT-OS#8518 is now removed.
- When DBG_STANDBY, DBG_STOP or DBG_SLEEP are set in DBG_CR, a hardfault occurs on wakeup from sleep. This was first diagnosed in RIOT-OS#8518. When enabled, a hardfault occurred when returning from a branch to irq_restore(); we avoid the call by inlining the function. See RIOT-OS#11830 for more details.
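The condition named in these commit messages amounts to a mask test on the DBG_CR value. The sketch below is an illustration only: the bit positions are assumptions based on the usual STM32 DBGMCU_CR layout and should be checked against the reference manual for the exact part, and lowpower_debug_enabled() is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: bit positions; verify against the reference manual */
#define DBG_SLEEP    (1u << 0)
#define DBG_STOP     (1u << 1)
#define DBG_STANDBY  (1u << 2)

/* True if any of the low-power debug bits is set in the given DBG_CR value */
bool lowpower_debug_enabled(uint32_t dbg_cr)
{
    return (dbg_cr & (DBG_SLEEP | DBG_STOP | DBG_STANDBY)) != 0;
}
```

A check like this, run against the register value read on target, would tell whether a given debug session is in the configuration in which the hardfault was observed.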
Contribution description
Currently, the stm32l1x hardfaults because the IRQ state is stored in r0, which for some reason is lost after wake-up.
This PR fixes that by ensuring that the state is stored in RAM. The actual, less intrusive fix is to add a __NOP(); just after wake-up, which also solves the problem.
It additionally allows choosing a different low-speed clock source, since by default LSE (external) is hardcoded. Although this might belong in another PR, users of an old revision of the Nucleo boards cannot test otherwise (so far one of the two supported stm32l1x-based boards).
Issues/PRs references
Refs #8024