improve cold_path() #133852
Just found that this also works:
let new_buf = Global.allocate(layout).map_err(|_| {
    cold_path();
    Error::new_alloc_failed("Cannot allocate memory.")
})?;
As long as the closure is inlined, branch weights will be emitted.
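A minimal, self-contained sketch of the same pattern follows. It is illustrative only: `cold_path` here is a stable stand-in built from the `#[cold]` attribute rather than the compiler intrinsic, and `try_alloc`, `AllocFailed`, and `make_buffer` are hypothetical names that do not come from this PR.

```rust
/// Stable stand-in for the `cold_path()` intrinsic (an assumption of this
/// sketch): calling a `#[cold]` function hints that the caller's path is unlikely.
#[inline]
#[cold]
fn cold_path() {}

/// Hypothetical error type for the example.
#[derive(Debug, PartialEq)]
struct AllocFailed(&'static str);

/// Hypothetical fallible allocator: refuses requests larger than `limit`.
fn try_alloc(len: usize, limit: usize) -> Result<Vec<u8>, ()> {
    if len <= limit {
        Ok(vec![0u8; len])
    } else {
        Err(())
    }
}

/// The error-mapping closure marks the failure path as cold before
/// building the error value, mirroring the snippet above.
fn make_buffer(len: usize, limit: usize) -> Result<Vec<u8>, AllocFailed> {
    let buf = try_alloc(len, limit).map_err(|_| {
        cold_path();
        AllocFailed("Cannot allocate memory.")
    })?;
    Ok(buf)
}
```

Whether the hint reaches the branch in `make_buffer` depends on the closure being inlined, as noted in the comment above.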
I'm not the best person to review this, sorry. r? compiler
☔ The latest upstream changes (presumably #130060) made this pull request unmergeable. Please resolve the merge conflicts.
r? compiler
maybe r? @nikic 😅
Anything like this should be re-rolled to the codegen group (which I am in), not the compiler overall. The compiler group is very big and will often just toss around review on things a lot. And @x17jiri if you feel lost in the review process don't hesitate to reach out on the Zulip: https://rust-lang.zulipchat.com/
r? saethlin
I can get to this in a few days max. If someone else wants to approve it before then, feel free.
let cold_weight = unsafe { llvm::LLVMValueAsMetadata(self.cx.const_u32(1)) };
let hot_weight = unsafe { llvm::LLVMValueAsMetadata(self.cx.const_u32(2000)) };
Are 1 and 2000 derived from anything in particular? If these are the magic values clang uses or something like that, a comment would be great.
These values are used by `llvm.expect` for branches with 2 targets. I added a comment.
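To make the effect of the 1 and 2000 weights concrete, here is a small sketch of assigning them per switch target and reading them back as relative probabilities. It is not the PR's codegen code: `branch_weights` and `implied_probabilities` are hypothetical helpers, and the rule "cold target gets 1, every other target gets 2000" is an assumption based on the two constants quoted above.

```rust
/// Weights quoted in the review thread above.
const COLD_WEIGHT: u32 = 1;
const HOT_WEIGHT: u32 = 2000;

/// Hypothetical helper: one weight per switch target, chosen by whether
/// that target was marked cold.
fn branch_weights(target_is_cold: &[bool]) -> Vec<u32> {
    target_is_cold
        .iter()
        .map(|&cold| if cold { COLD_WEIGHT } else { HOT_WEIGHT })
        .collect()
}

/// Hypothetical helper: the relative probability each weight implies,
/// i.e. the weight divided by the sum of all weights.
fn implied_probabilities(weights: &[u32]) -> Vec<f64> {
    let total: u64 = weights.iter().map(|&w| u64::from(w)).sum();
    weights
        .iter()
        .map(|&w| f64::from(w) / total as f64)
        .collect()
}
```

With one hot and one cold target, the hot side gets 2000 / 2001, roughly 99.95% of the probability mass.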
Just a few nits, and you have a merge conflict. Then I'd like to see this go through a perf run before we merge; this isn't just making |
These are good tests. Thank you.
@rustbot label: +perf-regression-triaged
@bors r+
improve cold_path()

rust-lang#120370 added a new intrinsic `cold_path()` and used it to fix `likely` and `unlikely`.

However, in order to limit scope, the information about cold code paths is only used in 2-target switch instructions. This is sufficient for `likely` and `unlikely`, but limits the usefulness of `cold_path` for idiomatic Rust. For example, code like this:

```
if let Some(x) = y { ... }
```

may generate a 3-target switch:

```
switch y.discriminator:
0 => true branch
1 => false branch
_ => unreachable
```

and therefore marking a branch as cold will have no effect.

This PR improves `cold_path()` to work with arbitrary switch instructions.

Note that for 2-target switches, we can use `llvm.expect`, but for multiple targets we need to manually emit branch weights. I checked Clang and it also emits weights in this situation. Clang's weight calculation is more complex than this PR's, which I believe is mainly because `switch` in `C/C++` can have multiple cases going to the same target.
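The two source shapes discussed in the description can be sketched as follows. This is an illustration under stated assumptions: `cold_path` is a stable `#[cold]` stand-in rather than the intrinsic, and `increment_or_zero`, `Request`, and `cost` are hypothetical names. Whether a given `if let` or `match` actually lowers to a switch with three or more targets depends on the compiler and the types involved.

```rust
/// Stable stand-in for the `cold_path()` intrinsic; an assumption of this sketch.
#[inline]
#[cold]
fn cold_path() {}

/// `if let` on an `Option`, with the `None` branch marked cold.
fn increment_or_zero(y: Option<u32>) -> u32 {
    if let Some(x) = y {
        x + 1
    } else {
        cold_path();
        0
    }
}

/// Hypothetical three-variant enum, so the `match` below has more than
/// two possible targets.
enum Request {
    Read(u32),
    Write(u32),
    Shutdown,
}

/// Only the `Shutdown` arm is marked cold; the other arms stay hot.
fn cost(req: &Request) -> u32 {
    match req {
        Request::Read(n) => *n,
        Request::Write(n) => *n * 2,
        Request::Shutdown => {
            cold_path();
            0
        }
    }
}
```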
💔 Test failed - checks-actions
It seems the test is failing because the metadata have different numbers on the
Nope. But if you push a fix that uses a regex and works on Linux, I'll adjust the PR description to run try-job on apple and a few other platforms: https://rustc-dev-guide.rust-lang.org/tests/ci.html#try-builds
@saethlin It should be fixed
@bors try
improve cold_path()
try-job: aarch64-apple
try-job: test-various
☀️ Try build successful - checks-actions
@bors r+
☀️ Test successful - checks-actions
Finished benchmarking commit (3b022d8): comparison URL.
Overall result: ❌✅ regressions and improvements - please read the text below
Our benchmarks found a performance regression caused by this PR.
Next Steps:
@rustbot label: +perf-regression
Instruction count
This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.
Max RSS (memory usage)
Results (primary -1.9%, secondary 0.2%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles
Results (primary 1.0%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size
Results (primary 0.0%, secondary 0.1%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 774.633s -> 775.584s (0.12%)