included range in for loop gives more assembly code than excluded range #102462

melonges · 2022-09-29T09:37:42Z

Given the following code, where the range excludes its upper bound *source (0..*source):

pub fn store(source: &i32, dest: &mut i32) -> i32 {
    let mut accum = 0;
    for _ in 0..*source {  // source will be excluded
        accum += *dest;
    }
    accum
}

The assembly output is:

example::store:
        mov     ecx, dword ptr [rdi]
        mov     edx, dword ptr [rsi]
        imul    edx, ecx
        xor     eax, eax
        test    ecx, ecx
        cmovg   eax, edx
        ret

If the upper bound *source is included instead (0..=*source), the generated assembly is considerably longer:

pub fn store(source: &i32, dest: &mut i32) -> i32 {
    let mut accum = 0;
    for _ in 0..=*source { //  source will be included
        accum += *dest;
    }
    accum
}

The assembly output is:

example::store:
        mov     ecx, dword ptr [rdi]
        test    ecx, ecx
        js      .LBB0_1
        mov     r8d, dword ptr [rsi]
        xor     eax, eax
        xor     esi, esi
.LBB0_3:
        xor     edi, edi
        cmp     esi, ecx
        setl    dl
        add     eax, r8d
        cmp     esi, ecx
        jge     .LBB0_5
        mov     dil, dl
        add     esi, edi
        cmp     esi, ecx
        jle     .LBB0_3
.LBB0_5:
        ret
.LBB0_1:
        xor     eax, eax
        ret

I tested on rustc 1.64.0 with -C opt-level=3


Rageking8 · 2022-09-29T11:27:44Z

@rustbot label -A-diagnostics +A-codegen +I-heavy

lukas-code · 2022-09-29T11:58:23Z

This regressed between 1.42 and 1.43: Godbolt link

related: #45222

leonardo-m · 2022-09-30T08:39:14Z

(If you have two nested loops, the code is bad even in older compilers.)

jdahlstrom · 2022-10-01T22:04:42Z

> This regressed between 1.42 and 1.43: Godbolt link

The 1.42 codegen is tricky, as it seems at first that it doesn't account for the case where *source == i32::MAX and just goes add eax, 1. But it ends up giving the expected result, which is 0 (mod 2^31) no matter the value of *dest. But in many other cases inclusive ranges necessarily produce suboptimal code because the T::MAX case requires special handling.
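The wraparound argument above can be checked cheaply on a narrower type. The following is a minimal sketch, not the compiler's actual transformation: it narrows the loop to i8 (an assumption made so that every input can be tried), uses wrapping_add to make the release-mode overflow behaviour explicit, and compares the loop against a hypothetical closed form (source + 1) * dest computed with wrapping arithmetic. The function names are made up for illustration.

```rust
/// The loop from the issue, narrowed to i8 so that all inputs can be checked.
/// `wrapping_add` spells out the wraparound that release builds perform.
pub fn store_loop_i8(source: i8, dest: i8) -> i8 {
    let mut accum: i8 = 0;
    for _ in 0..=source {
        accum = accum.wrapping_add(dest);
    }
    accum
}

/// Hypothetical closed form: (source + 1) * dest, with wrapping arithmetic.
/// A negative `source` means the inclusive range is empty.
pub fn store_closed_form_i8(source: i8, dest: i8) -> i8 {
    if source < 0 {
        0
    } else {
        source.wrapping_add(1).wrapping_mul(dest)
    }
}

/// Exhaustively compares both versions over every (source, dest) pair.
pub fn closed_form_matches_everywhere() -> bool {
    (i8::MIN..=i8::MAX).all(|s| {
        (i8::MIN..=i8::MAX).all(|d| store_loop_i8(s, d) == store_closed_form_i8(s, d))
    })
}
```

For source == i8::MAX the loop runs 128 times, so the result is 0 for even dest and i8::MIN for odd dest; both are 0 (mod 2^7), which is the i8 analogue of the "0 (mod 2^31)" remark for i32.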

nikic · 2022-10-01T22:32:03Z

Simpler example without the unnecessary memory indirections: https://rust.godbolt.org/z/GPvrbvbYY

External iteration over inclusive ranges is well known to optimize badly. Optimization may be viable in this particular case though.
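For contrast, here is a sketch of the same loop written with external iteration (a for loop calling next()) and with internal iteration (for_each). Internal iteration over inclusive ranges is often reported to optimize better, but whether it helps for this particular function has not been verified here; the function names and the by-value parameters are assumptions for illustration.

```rust
/// External iteration: the `for` loop repeatedly calls `next()` on the range.
pub fn sum_external(n: i32, step: i32) -> i32 {
    let mut accum: i32 = 0;
    for _ in 0..=n {
        accum = accum.wrapping_add(step);
    }
    accum
}

/// Internal iteration: `for_each` drives the loop from inside the iterator.
pub fn sum_internal(n: i32, step: i32) -> i32 {
    let mut accum: i32 = 0;
    (0..=n).for_each(|_| accum = accum.wrapping_add(step));
    accum
}
```

Both functions compute the same value; any difference would show up only in the generated assembly, which is best compared on Godbolt with -C opt-level=3.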

nikic · 2022-12-21T14:20:00Z

This is the fold we'd need: https://alive2.llvm.org/ce/z/JsWRvT
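The LLVM pull request referenced from this issue (llvm/llvm-project#118579) names the fold as "X < Y ? (X + zext(X < Y)) <= Y : false to X < Y". Assuming that is the fold meant here, the sketch below checks it exhaustively for 8-bit integers, both unsigned and signed. It is only a sanity check at one bit width, not a proof, and the function names are hypothetical.

```rust
/// Unsigned form of the select expression: X < Y ? (X + zext(X < Y)) <= Y : false
fn select_form_u8(x: u8, y: u8) -> bool {
    if x < y {
        x.wrapping_add((x < y) as u8) <= y
    } else {
        false
    }
}

/// Signed form of the same expression.
fn select_form_i8(x: i8, y: i8) -> bool {
    if x < y {
        x.wrapping_add((x < y) as i8) <= y
    } else {
        false
    }
}

/// True if the select expression equals `x < y` for every pair of u8 values.
pub fn fold_holds_for_all_u8() -> bool {
    (0..=u8::MAX).all(|x| (0..=u8::MAX).all(|y| select_form_u8(x, y) == (x < y)))
}

/// True if the select expression equals `x < y` for every pair of i8 values.
pub fn fold_holds_for_all_i8() -> bool {
    (i8::MIN..=i8::MAX).all(|x| (i8::MIN..=i8::MAX).all(|y| select_form_i8(x, y) == (x < y)))
}
```

The reasoning is short: inside the true branch X < Y holds, so the zext contributes 1, and X + 1 cannot overflow because X is strictly below Y; X + 1 <= Y is then equivalent to X < Y.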

ilyvion · 2023-12-22T06:12:36Z

Came across this weirdness today; it's still a problem in December 2023:

pub fn inclusive() -> usize {
    (2..=100).step_by(2).sum::<usize>()
}

pub fn exclusive() -> usize {
    (2..101).step_by(2).sum::<usize>()
}

produces assembly with 46 lines and 3 lines, respectively. Was going to create a new bug for it, but I'm pretty sure it's just another version of this one.
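As a quick sanity check that the two functions differ only in codegen and not in meaning, the sketch below repeats them and compares both against the arithmetic-series closed form. The helper closed_form is hypothetical and not part of the original report.

```rust
pub fn inclusive() -> usize {
    (2..=100).step_by(2).sum::<usize>()
}

pub fn exclusive() -> usize {
    (2..101).step_by(2).sum::<usize>()
}

/// Sum of the even numbers 2, 4, ..., 100, i.e. 2 * (1 + 2 + ... + 50).
pub fn closed_form() -> usize {
    2 * (50 * 51 / 2)
}
```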

veera-sivarajan · 2024-11-22T03:34:47Z

@rustbot claim

RodBurman · 2025-03-16T19:11:20Z

The code snippet:

pub fn inclusive() -> usize {
    (2..=100).step_by(2).sum::<usize>()
}

pub fn exclusive() -> usize {
    (2..101).step_by(2).sum::<usize>()
}

compiled with:

% rustc -v -V 
rustc 1.85.0 (4d91de4e4 2025-02-17)
binary: rustc
commit-hash: 4d91de4e48198da2e33413efdcd9cd2cc0c46688
commit-date: 2025-02-17
host: aarch64-apple-darwin
release: 1.85.0
LLVM version: 19.1.7

produces functions that both take up 25 lines of assembly (not necessarily 25 instructions), so there is no difference between inclusive and exclusive ranges. I am unsure whether this is due to the update to Rust 1.85.0, the move to an ARM processor, or the move to macOS.

melonges added the labels A-diagnostics (Area: Messages for errors, warnings, and lints) and T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.) on Sep 29, 2022

rustbot added the labels A-codegen (Area: Code generation) and I-heavy (Issue: Problems and improvements with respect to binary size of generated code.) and removed the label A-diagnostics on Sep 29, 2022

nikic added the labels A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.) and I-slow (Issue: Problems and improvements with respect to performance of generated code.) on Oct 1, 2022

rustbot assigned veera-sivarajan on Nov 22, 2024

veera-sivarajan mentioned this issue on Dec 4, 2024 in llvm/llvm-project#118579 (Open): [InstSimplify] Fold X < Y ? (X + zext(X < Y)) <= Y : false to X < Y

workingjubilee added the label C-optimization (Category: An issue highlighting optimization opportunities or PRs implementing such) on Feb 14, 2025
