Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

impl Ord for bool: optimization #66780

Closed
carado opened this issue Nov 26, 2019 · 3 comments · Fixed by #66881
Closed

impl Ord for bool: optimization #66780

carado opened this issue Nov 26, 2019 · 3 comments · Fixed by #66881
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@carado
Copy link

carado commented Nov 26, 2019

Currently, impl Ord for bool converts the bools to u8s and then compares them.

On the playground,

pub fn using_ord(x: bool, y: bool) -> Ordering {
    x.cmp(&y)
}

compiles to

playground::using_ord: # @playground::using_ord
# %bb.0:
	movl	%edi, %ecx
	xorl	%esi, %ecx
	testl	%esi, %esi
	movl	$255, %eax
	cmovel	%ecx, %eax
	testl	%edi, %edi
	cmovnel	%ecx, %eax
                                        # kill: def $al killed $al killed $eax
	retq
                                        # -- End function

A much shorter (in instruction count) and almost certainly faster version taking advantage of the fact that orderings are represented by i8 values -1, 0, and 1, is:

pub fn minus_transmute(x: bool, y: bool) -> Ordering {
    unsafe { std::mem::transmute(x as i8 - y as i8) }
}

which on the playground compiles to

playground::minus_transmute: # @playground::minus_transmute
# %bb.0:
	movl	%edi, %eax
	subb	%sil, %al
                                        # kill: def $al killed $al killed $eax
	retq
                                        # -- End function
@jonas-schievink jonas-schievink added I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. C-enhancement Category: An issue proposing an enhancement or a PR with one. labels Nov 26, 2019
@alex
Copy link
Member

alex commented Nov 27, 2019

FWIW, you get identical codegen with a match statement (+ an unreachable_unchecked), which I find slightly more readable than the transmute:

pub fn minus_match(x: bool, y: bool) -> Ordering {
    match x as i8 - y as i8 {
        -1 => Ordering::Less,
        0 => Ordering::Equal,
        1 => Ordering::Greater,
        _ => unsafe { std::hint::unreachable_unchecked() }
    }
}

@carado
Copy link
Author

carado commented Nov 27, 2019

Fair enough; that does seem like it relies on less assumptions about "the outside", too.

@krishna-veerareddy
Copy link
Contributor

I would like to work on this issue unless somebody else already started working on it.

@bors bors closed this as completed in 830b4ee Dec 11, 2019
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Aug 15, 2023
…r=cuviper

Optimizing the rest of bool's Ord implementation

After coming across issue rust-lang#66780, I realized that the other functions provided by Ord (`min`, `max`, and `clamp`) were similarly inefficient for bool. This change provides implementations for them in terms of boolean operators, resulting in much simpler assembly and faster code.
Fixes issue rust-lang#114653

[Comparison on Godbolt](https://rust.godbolt.org/z/5nb5P8e8j)

`max` assembly before:
```assembly
example::max:
        mov     eax, edi
        mov     ecx, eax
        neg     cl
        mov     edx, esi
        not     dl
        cmp     dl, cl
        cmove   eax, esi
        ret
```
`max` assembly after:
```assembly
example::max:
        mov     eax, edi
        or      eax, esi
        ret
```
`clamp` assembly before:
```assembly
example::clamp:
        mov     eax, esi
        sub     al, dl
        inc     al
        cmp     al, 2
        jae     .LBB1_1
        mov     eax, edi
        sub     al, sil
        movzx   ecx, dil
        sub     dil, dl
        cmp     dil, 1
        movzx   edx, dl
        cmovne  edx, ecx
        cmp     al, -1
        movzx   eax, sil
        cmovne  eax, edx
        ret
.LBB1_1:
        ; identical assert! code
```
`clamp` assembly after:
```assembly
example::clamp:
        test    edx, edx
        jne     .LBB1_2
        test    sil, sil
        jne     .LBB1_3
.LBB1_2:
        or      dil, sil
        and     dil, dl
        mov     eax, edi
        ret
.LBB1_3:
        ; identical assert! code
```
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Aug 16, 2023
…r=cuviper

Optimizing the rest of bool's Ord implementation

After coming across issue rust-lang#66780, I realized that the other functions provided by Ord (`min`, `max`, and `clamp`) were similarly inefficient for bool. This change provides implementations for them in terms of boolean operators, resulting in much simpler assembly and faster code.
Fixes issue rust-lang#114653

[Comparison on Godbolt](https://rust.godbolt.org/z/5nb5P8e8j)

`max` assembly before:
```assembly
example::max:
        mov     eax, edi
        mov     ecx, eax
        neg     cl
        mov     edx, esi
        not     dl
        cmp     dl, cl
        cmove   eax, esi
        ret
```
`max` assembly after:
```assembly
example::max:
        mov     eax, edi
        or      eax, esi
        ret
```
`clamp` assembly before:
```assembly
example::clamp:
        mov     eax, esi
        sub     al, dl
        inc     al
        cmp     al, 2
        jae     .LBB1_1
        mov     eax, edi
        sub     al, sil
        movzx   ecx, dil
        sub     dil, dl
        cmp     dil, 1
        movzx   edx, dl
        cmovne  edx, ecx
        cmp     al, -1
        movzx   eax, sil
        cmovne  eax, edx
        ret
.LBB1_1:
        ; identical assert! code
```
`clamp` assembly after:
```assembly
example::clamp:
        test    edx, edx
        jne     .LBB1_2
        test    sil, sil
        jne     .LBB1_3
.LBB1_2:
        or      dil, sil
        and     dil, dl
        mov     eax, edi
        ret
.LBB1_3:
        ; identical assert! code
```
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants