-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
impl Ord for bool: optimization #66780
Labels
C-enhancement
Category: An issue proposing an enhancement or a PR with one.
I-slow
Issue: Problems and improvements with respect to performance of generated code.
T-libs-api
Relevant to the library API team, which will review and decide on the PR/issue.
Comments
FWIW, you get identical codegen with a pub fn minus_match(x: bool, y: bool) -> Ordering {
match x as i8 - y as i8 {
-1 => Ordering::Less,
0 => Ordering::Equal,
1 => Ordering::Greater,
_ => unsafe { std::hint::unreachable_unchecked() }
}
} |
Fair enough; that does seem like it relies on less assumptions about "the outside", too. |
I would like to work on this issue unless somebody else already started working on it. |
GuillaumeGomez
added a commit
to GuillaumeGomez/rust
that referenced
this issue
Aug 15, 2023
…r=cuviper Optimizing the rest of bool's Ord implementation After coming across issue rust-lang#66780, I realized that the other functions provided by Ord (`min`, `max`, and `clamp`) were similarly inefficient for bool. This change provides implementations for them in terms of boolean operators, resulting in much simpler assembly and faster code. Fixes issue rust-lang#114653 [Comparison on Godbolt](https://rust.godbolt.org/z/5nb5P8e8j) `max` assembly before: ```assembly example::max: mov eax, edi mov ecx, eax neg cl mov edx, esi not dl cmp dl, cl cmove eax, esi ret ``` `max` assembly after: ```assembly example::max: mov eax, edi or eax, esi ret ``` `clamp` assembly before: ```assembly example::clamp: mov eax, esi sub al, dl inc al cmp al, 2 jae .LBB1_1 mov eax, edi sub al, sil movzx ecx, dil sub dil, dl cmp dil, 1 movzx edx, dl cmovne edx, ecx cmp al, -1 movzx eax, sil cmovne eax, edx ret .LBB1_1: ; identical assert! code ``` `clamp` assembly after: ```assembly example::clamp: test edx, edx jne .LBB1_2 test sil, sil jne .LBB1_3 .LBB1_2: or dil, sil and dil, dl mov eax, edi ret .LBB1_3: ; identical assert! code ```
matthiaskrgr
added a commit
to matthiaskrgr/rust
that referenced
this issue
Aug 16, 2023
…r=cuviper Optimizing the rest of bool's Ord implementation After coming across issue rust-lang#66780, I realized that the other functions provided by Ord (`min`, `max`, and `clamp`) were similarly inefficient for bool. This change provides implementations for them in terms of boolean operators, resulting in much simpler assembly and faster code. Fixes issue rust-lang#114653 [Comparison on Godbolt](https://rust.godbolt.org/z/5nb5P8e8j) `max` assembly before: ```assembly example::max: mov eax, edi mov ecx, eax neg cl mov edx, esi not dl cmp dl, cl cmove eax, esi ret ``` `max` assembly after: ```assembly example::max: mov eax, edi or eax, esi ret ``` `clamp` assembly before: ```assembly example::clamp: mov eax, esi sub al, dl inc al cmp al, 2 jae .LBB1_1 mov eax, edi sub al, sil movzx ecx, dil sub dil, dl cmp dil, 1 movzx edx, dl cmovne edx, ecx cmp al, -1 movzx eax, sil cmovne eax, edx ret .LBB1_1: ; identical assert! code ``` `clamp` assembly after: ```assembly example::clamp: test edx, edx jne .LBB1_2 test sil, sil jne .LBB1_3 .LBB1_2: or dil, sil and dil, dl mov eax, edi ret .LBB1_3: ; identical assert! code ```
# for free
to join this conversation on GitHub.
Already have an account?
# to comment
Labels
C-enhancement
Category: An issue proposing an enhancement or a PR with one.
I-slow
Issue: Problems and improvements with respect to performance of generated code.
T-libs-api
Relevant to the library API team, which will review and decide on the PR/issue.
Currently,
impl Ord for bool
converts thebool
s tou8
s and then compares them.On the playground,
compiles to
A much shorter (in instruction count) and almost certainly faster version taking advantage of the fact that orderings are represented by
i8
values -1, 0, and 1, is:which on the playground compiles to
The text was updated successfully, but these errors were encountered: