Inclusive ranges (..=) slower than exclusive range + 1 (..n + 1) #131333

Closed
blyxyas opened this issue Oct 6, 2024 · 1 comment
Labels
C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such
I-slow Issue: Problems and improvements with respect to performance of generated code.
T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@blyxyas
Member

blyxyas commented Oct 6, 2024

I'm not sure why this happens, but an inclusive range (..=n) generates more instructions than an exclusive range with the bound bumped by one (..n + 1). We should figure out which one performs better and make both generate equally efficient code. (This question sprouted from rust-lang/rust-clippy#8317.)
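
For readers without the Godbolt link open, here is a minimal sketch of the kind of source that would produce the two listings. The issue text does not inline the Rust code, so the function names and bodies are assumptions inferred from the assembly (a `u32` argument, a counter starting at 0, one `println!` per iteration). The `collect_*` helpers are hypothetical additions that only make the two forms easy to compare.

```rust
// Hypothetical reconstruction: names and bodies are assumptions, not the
// author's actual Godbolt source.

/// Exclusive range with the upper bound bumped by one (the `..n + 1` form).
/// Note that `n + 1` overflows when `n == u32::MAX`: it panics in debug
/// builds and wraps to 0 in default release builds, giving an empty range.
pub fn print_exclusive_plus_one(n: u32) {
    for i in 0..n + 1 {
        println!("{}", i);
    }
}

/// Inclusive range (the `..=n` form). Valid for every `n`, including
/// `u32::MAX`.
pub fn print_inclusive(n: u32) {
    for i in 0..=n {
        println!("{}", i);
    }
}

/// Hypothetical helper: the values visited by the `..n + 1` form.
pub fn collect_exclusive_plus_one(n: u32) -> Vec<u32> {
    (0..n + 1).collect()
}

/// Hypothetical helper: the values visited by the `..=n` form.
pub fn collect_inclusive(n: u32) -> Vec<u32> {
    (0..=n).collect()
}
```

For any `n` below `u32::MAX` the two forms visit the same values, so they are only interchangeable when the upper bound cannot overflow.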

Godbolt

..n + 1
foo:
        cmp     edi, -1
        je      .LBB0_4
        push    rbp
        push    r15
        push    r14
        push    r13
        push    r12
        push    rbx
        sub     rsp, 72
        mov     ebx, edi
        inc     ebx
        xor     ebp, ebp
        lea     r13, [rip + .L__unnamed_1]
        lea     r15, [rsp + 8]
        lea     r14, [rsp + 24]
        mov     r12, qword ptr [rip + std::io::stdio::_print::h560473ec6c8cb59b@GOTPCREL]
.LBB0_2:
        mov     dword ptr [rsp + 4], ebp
        inc     ebp
        lea     rax, [rsp + 4]
        mov     qword ptr [rsp + 8], rax
        mov     rax, qword ptr [rip + core::fmt::num::imp::<impl core::fmt::Display for u32>::fmt::hc9449d7f3b1b8610@GOTPCREL]
        mov     qword ptr [rsp + 16], rax
        mov     qword ptr [rsp + 24], r13
        mov     qword ptr [rsp + 32], 2
        mov     qword ptr [rsp + 56], 0
        mov     qword ptr [rsp + 40], r15
        mov     qword ptr [rsp + 48], 1
        mov     rdi, r14
        call    r12
        cmp     ebx, ebp
        jne     .LBB0_2
        add     rsp, 72
        pop     rbx
        pop     r12
        pop     r13
        pop     r14
        pop     r15
        pop     rbp
.LBB0_4:
        ret

.L__unnamed_2:
        .byte   10

.L__unnamed_1:
        .quad   1
        .zero   8
        .quad   .L__unnamed_2
        .asciz  "\001\000\000\000\000\000\000"

..=n
foo:
        push    rbp
        push    r15
        push    r14
        push    r13
        push    r12
        push    rbx
        sub     rsp, 72
        mov     ebx, edi
        xor     ebp, ebp
        lea     r15, [rsp + 8]
        lea     r14, [rsp + 24]
        mov     r12, qword ptr [rip + std::io::stdio::_print::h560473ec6c8cb59b@GOTPCREL]
.LBB0_1:
        mov     r13d, ebp
        cmp     ebp, ebx
        adc     ebp, 0
        mov     dword ptr [rsp + 4], r13d
        lea     rax, [rsp + 4]
        mov     qword ptr [rsp + 8], rax
        mov     rax, qword ptr [rip + core::fmt::num::imp::<impl core::fmt::Display for u32>::fmt::hc9449d7f3b1b8610@GOTPCREL]
        mov     qword ptr [rsp + 16], rax
        lea     rax, [rip + .L__unnamed_1]
        mov     qword ptr [rsp + 24], rax
        mov     qword ptr [rsp + 32], 2
        mov     qword ptr [rsp + 56], 0
        mov     qword ptr [rsp + 40], r15
        mov     qword ptr [rsp + 48], 1
        mov     rdi, r14
        call    r12
        cmp     r13d, ebx
        jae     .LBB0_3
        cmp     ebp, ebx
        jbe     .LBB0_1
.LBB0_3:
        add     rsp, 72
        pop     rbx
        pop     r12
        pop     r13
        pop     r14
        pop     r15
        pop     rbp
        ret

.L__unnamed_2:
        .byte   10

.L__unnamed_1:
        .quad   1
        .zero   8
        .quad   .L__unnamed_2
        .asciz  "\001\000\000\000\000\000\000"
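
To make the difference between the two listings easier to see, the loops can be transcribed back into Rust. This is one reading of the assembly above, not compiler output or library code: the function names are hypothetical, and a `Vec` push stands in for the `_print` call. The exclusive loop does one increment and one compare per iteration after an up-front check for `n == u32::MAX`, while the inclusive loop does a conditional increment (`cmp` + `adc`) and two compares per iteration.

```rust
/// Model of the `..n + 1` listing (assumed reading of the assembly).
pub fn exclusive_as_compiled(n: u32) -> Vec<u32> {
    let mut out = Vec::new();
    if n == u32::MAX {
        return out; // cmp edi, -1 ; je .LBB0_4 (wrapped bound, empty range)
    }
    let end = n + 1; // inc ebx
    let mut i: u32 = 0; // xor ebp, ebp
    loop {
        out.push(i); // store i, then call _print
        i += 1; // inc ebp
        if i == end {
            break; // cmp ebx, ebp ; jne .LBB0_2
        }
    }
    out
}

/// Model of the `..=n` listing (assumed reading of the assembly).
pub fn inclusive_as_compiled(n: u32) -> Vec<u32> {
    let mut out = Vec::new();
    let mut i: u32 = 0; // xor ebp, ebp
    loop {
        let cur = i; // mov r13d, ebp
        i += (cur < n) as u32; // cmp ebp, ebx ; adc ebp, 0
        out.push(cur); // call _print
        if cur >= n {
            break; // cmp r13d, ebx ; jae .LBB0_3
        }
        if i > n {
            break; // cmp ebp, ebx ; jbe .LBB0_1 (loops while i <= n)
        }
    }
    out
}
```

In this model the second exit test in the inclusive loop can never fire: whenever `cur < n` holds, `i` is `cur + 1`, which is at most `n`. The conditional increment is what lets the inclusive form handle `n == u32::MAX` without overflowing.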
@blyxyas added the C-optimization label Oct 6, 2024
@rustbot added the needs-triage label Oct 6, 2024
@saethlin closed this as not planned Oct 6, 2024
@saethlin
Member

saethlin commented Oct 6, 2024

This is a duplicate of #45222.

@saethlin added the I-slow and T-compiler labels and removed the needs-triage label Oct 6, 2024