Skip to content
This repository has been archived by the owner on Jan 30, 2024. It is now read-only.

Add support for measuring the program's stack usage #254

Merged
merged 9 commits into from
Sep 8, 2021
Merged

Add support for measuring the program's stack usage #254

merged 9 commits into from
Sep 8, 2021

Conversation

jonas-schievink
Copy link
Contributor

@jonas-schievink jonas-schievink commented Sep 6, 2021

This cleans up the stack canary code a bit, and adds a --measure-stack command that will paint the entire available stack region with a known bit pattern, and then report how much of the available stack space was used by the program.

Closes #188

(HOST) INFO  flashing program (7.03 KiB)
(HOST) INFO  success!
(HOST) INFO  painting 261063 bytes of RAM for stack usage estimation
────────────────────────────────────────────────────────────────────────────────
0 INFO  recurse(times=10)
└─ overflow::recurse @ /home/jonas/dev/probe-test/src/bin/overflow.rs:13
1 INFO  recurse(times=9)
└─ overflow::recurse @ /home/jonas/dev/probe-test/src/bin/overflow.rs:13
2 INFO  recurse(times=8)
└─ overflow::recurse @ /home/jonas/dev/probe-test/src/bin/overflow.rs:13
3 INFO  recurse(times=7)
└─ overflow::recurse @ /home/jonas/dev/probe-test/src/bin/overflow.rs:13
4 INFO  recurse(times=6)
└─ overflow::recurse @ /home/jonas/dev/probe-test/src/bin/overflow.rs:13
5 INFO  recurse(times=5)
└─ overflow::recurse @ /home/jonas/dev/probe-test/src/bin/overflow.rs:13
6 INFO  recurse(times=4)
└─ overflow::recurse @ /home/jonas/dev/probe-test/src/bin/overflow.rs:13
7 INFO  recurse(times=3)
└─ overflow::recurse @ /home/jonas/dev/probe-test/src/bin/overflow.rs:13
8 INFO  recurse(times=2)
└─ overflow::recurse @ /home/jonas/dev/probe-test/src/bin/overflow.rs:13
9 INFO  recurse(times=1)
└─ overflow::recurse @ /home/jonas/dev/probe-test/src/bin/overflow.rs:13
10 INFO  recurse(times=0)
└─ overflow::recurse @ /home/jonas/dev/probe-test/src/bin/overflow.rs:13
────────────────────────────────────────────────────────────────────────────────
(HOST) INFO  reading 261063 bytes of RAM for stack usage estimation
(HOST) INFO  program has used at least 23240/261063 bytes of stack space
(HOST) INFO  device halted without error

@japaric japaric self-assigned this Sep 8, 2021
Copy link
Member

@japaric japaric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I left some inline comments with nits around a data structure and the number representation.
Could you please open a few follow-up perf investigation issue? one for 'binary search read' and one for 'speed up canary write by having MCU do the memset'
Other than that, feel free to send this to bors 🚀

}
let address = *stack_range.start();
let canary = vec![CANARY_VALUE; size];
core.write_8(address, &canary)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this write_8 be inside the if measure_stack guard? ahh, nvm I see that the size value depends on measure_stack

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't need to be done in this PR but it would be good add e.g. trace level logs that report the time it takes to write the canary, read the canary and write the program to flash. Just so that we have some measurements in place for when we look into optimizing things.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the other thing that I noticed was that number of bytes reported was an odd number. now I see that CANARY_VALUE is a byte. I was kind of expecting a word since the stack pointer is usually moved 4 or 8 bytes at a time. I don't think a 1-byte canary value is a problem right now given that we are scanning the stack linearly but if we switch to binary search it may result in false positives so the sampling done in the binary search should be several bytes (maybe 24 bytes at a time or so)

if let Some(pos) = canary.iter().position(|b| *b != CANARY_VALUE) {
let touched_address = self.address + pos as u32;
log::debug!("canary was touched at {:#010X}", touched_address);
let min_stack_usage = match canary.iter().position(|b| *b != CANARY_VALUE) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't need to be addressed in this PR but this could be sped up (+) by doing a binary search in the measure_stack case. Though the above read_8 will eat most of runtime so, instead of reading the whole self.size at once, this part should sample the stack at sizeof(CANARY_VALUE) chunks at a time and use that to perform a binary search.

(+) before doing any hypothetical perf improvement let's measure how long this takes vs how long the write takes. it may be that this is negligible compared to the write.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the binary search optimization is going to work correctly, since the program won't touch every word in the used area of the stack, so the search might run into a CANARY_VALUE and incorrectly skip used areas of the stack

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I agree a 1-byte look-up is error prone hence binary search should read a bigger chunk at a time, see #254 (comment) . Or maybe instead of considering a single partition point to decide if one half is used or not we could use 2 or more partition points ("add more depth to the binary partition tree"). Hard to explain without a diagram; will think more about it and file an issue if what I'm thinking of even makes sense.

src/target_info.rs Outdated Show resolved Hide resolved
src/canary.rs Outdated Show resolved Hide resolved
@jonas-schievink
Copy link
Contributor Author

bors r+

@bors
Copy link
Contributor

bors bot commented Sep 8, 2021

Build succeeded:

@bors bors bot merged commit 9800f0d into knurling-rs:main Sep 8, 2021
@jonas-schievink jonas-schievink deleted the canary-v2 branch September 8, 2021 16:02
japaric added a commit that referenced this pull request Sep 14, 2021
@japaric japaric mentioned this pull request Sep 15, 2021
# for free to subscribe to this conversation on GitHub. Already have an account? #.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

flag to report observed max stack usage
2 participants