Skip to content

missed optimization for if let, even though equivalent match is optimized #110097

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
davemilter opened this issue Apr 8, 2023 · 3 comments · Fixed by #120268
Closed

missed optimization for if let, even though equivalent match is optimized #110097

davemilter opened this issue Apr 8, 2023 · 3 comments · Fixed by #120268
Assignees
Labels
A-mir-opt Area: MIR optimizations C-bug Category: This is a bug. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@davemilter
Copy link

davemilter commented Apr 8, 2023

I tried this code:

pub fn change_value(val: Result<i32, ()>) -> Result<i32, ()> {
    if let Ok(x) = val {
        Ok(x * 2)
    } else {
        Err(())
    }
}

pub fn change_value2(val: Result<i32, ()>) -> Result<i32, ()> {
    match val {
        Ok(x) => Ok(x * 2),
        Err(()) => Err(())
    }
}

I expected to see this happen:
in release build this code would translated into exact the same assembly/llvm-ir

Instead, this happened:

change_value:
	xorl	%eax, %eax
	testl	%edi, %edi
	setne	%al
	leal	(%rsi,%rsi), %edx
	retq
change_value2:
	movl	%edi, %eax
	leal	(%rsi,%rsi), %edx
	retq

and the llvm-ir is different:

define { i32, i32 } @_ZN10playground12change_value17h0230bd4091fd6529E(i32 noundef %0, i32 %1) unnamed_addr #0 {
start:
  %2 = icmp eq i32 %0, 0
  %_4 = shl i32 %1, 1
  %not. = xor i1 %2, true
  %.sroa.0.0 = zext i1 %not. to i32
  %.sroa.3.0 = select i1 %2, i32 %_4, i32 undef
  %3 = insertvalue { i32, i32 } undef, i32 %.sroa.0.0, 0
  %4 = insertvalue { i32, i32 } %3, i32 %.sroa.3.0, 1
  ret { i32, i32 } %4
}
define { i32, i32 } @_ZN10playground13change_value217h3302fbc2853d0a41E(i32 noundef %0, i32 %1) unnamed_addr #0 {
start:
  %switch = icmp eq i32 %0, 0
  %_4 = shl i32 %1, 1
  %.sroa.3.0 = select i1 %switch, i32 %_4, i32 undef
  %2 = insertvalue { i32, i32 } undef, i32 %0, 0
  %3 = insertvalue { i32, i32 } %2, i32 %.sroa.3.0, 1
  ret { i32, i32 } %3
}

Meta

I tried thin in rust-playground for 1.68.2, beta and nightly.

See also https://internals.rust-lang.org/t/why-match-is-better-then-if/18636

@davemilter davemilter added the C-bug Category: This is a bug. label Apr 8, 2023
@jyn514
Copy link
Member

jyn514 commented Apr 9, 2023

To be clear, is the bug that they are different? or that the first one has a missed optimization? we don't guarantee at the language level that if is lowered to match.

@jyn514 jyn514 added I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 9, 2023
@steffahn
Copy link
Member

steffahn commented Apr 9, 2023

As far as I know from the linked IRLO discussion, the main point is that it's a missed optimization. Besides the already posted information, looking at the MIR is also instructional, as things like looking at the case where match is used with a _ => pattern instead of Err(()) =>, or comparing the behavior here (where a 2-variant enum discriminant is used) with how similar code using a bool would be optimized. I will not reproduce the whole discussion here, but it's worth a look.

@jyn514 jyn514 changed the title match and if converted to different llvm-ir missed optimization for if, even though equivalent match is optimized Apr 9, 2023
@jyn514 jyn514 changed the title missed optimization for if, even though equivalent match is optimized missed optimization for if let, even though equivalent match is optimized Apr 9, 2023
@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023
@dianqk
Copy link
Member

dianqk commented Feb 28, 2024

@rustbot label +A-mir-opt
@rustbot claim

@rustbot rustbot added the A-mir-opt Area: MIR optimizations label Feb 28, 2024
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 28, 2024
…tchs, r=<try>

Replace the default branch with an unreachable branch If it is the last variant

Fixes rust-lang#119520. Fixes rust-lang#110097.

LLVM currently has limited ability to eliminate dead branches in switches, even with the patch of llvm/llvm-project#73446.

The main reasons are as follows:

- Additional costs are required to calculate the range of values, and there exist many scenarios that cannot be analyzed accurately.
- Matching values by bitwise calculation cannot handle odd branches, nor can it handle values like `-1, 0, 1`. See [SimplifyCFG.cpp#L5424](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/Transforms/Utils/SimplifyCFG.cpp#L5424) and https://llvm.godbolt.org/z/qYMqhvMa8
- The current range information is continuous, even if the metadata for the range is submitted. See [ConstantRange.cpp#L1869-L1870](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/IR/ConstantRange.cpp#L1869-L1870).
- The metadata of the range may be lost in passes such as SROA. See https://rust.godbolt.org/z/e7f87vKMK.

Although we can make improvements, I think it would be more appropriate to put this issue to rustc first. After all, we can easily know the possible values.

Note that we've currently found a slow compilation problem in the presence of unreachable branches. See
llvm/llvm-project#78578.

r? compiler
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 29, 2024
…tchs, r=<try>

Replace the default branch with an unreachable branch If it is the last variant

Fixes rust-lang#119520. Fixes rust-lang#110097.

LLVM currently has limited ability to eliminate dead branches in switches, even with the patch of llvm/llvm-project#73446.

The main reasons are as follows:

- Additional costs are required to calculate the range of values, and there exist many scenarios that cannot be analyzed accurately.
- Matching values by bitwise calculation cannot handle odd branches, nor can it handle values like `-1, 0, 1`. See [SimplifyCFG.cpp#L5424](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/Transforms/Utils/SimplifyCFG.cpp#L5424) and https://llvm.godbolt.org/z/qYMqhvMa8
- The current range information is continuous, even if the metadata for the range is submitted. See [ConstantRange.cpp#L1869-L1870](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/IR/ConstantRange.cpp#L1869-L1870).
- The metadata of the range may be lost in passes such as SROA. See https://rust.godbolt.org/z/e7f87vKMK.

Although we can make improvements, I think it would be more appropriate to put this issue to rustc first. After all, we can easily know the possible values.

Note that we've currently found a slow compilation problem in the presence of unreachable branches. See
llvm/llvm-project#78578.

r? compiler
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 29, 2024
…tchs, r=<try>

Replace the default branch with an unreachable branch If it is the last variant

Fixes rust-lang#119520. Fixes rust-lang#110097.

LLVM currently has limited ability to eliminate dead branches in switches, even with the patch of llvm/llvm-project#73446.

The main reasons are as follows:

- Additional costs are required to calculate the range of values, and there exist many scenarios that cannot be analyzed accurately.
- Matching values by bitwise calculation cannot handle odd branches, nor can it handle values like `-1, 0, 1`. See [SimplifyCFG.cpp#L5424](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/Transforms/Utils/SimplifyCFG.cpp#L5424) and https://llvm.godbolt.org/z/qYMqhvMa8
- The current range information is continuous, even if the metadata for the range is submitted. See [ConstantRange.cpp#L1869-L1870](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/IR/ConstantRange.cpp#L1869-L1870).
- The metadata of the range may be lost in passes such as SROA. See https://rust.godbolt.org/z/e7f87vKMK.

Although we can make improvements, I think it would be more appropriate to put this issue to rustc first. After all, we can easily know the possible values.

Note that we've currently found a slow compilation problem in the presence of unreachable branches. See
llvm/llvm-project#78578.

r? compiler
bors added a commit to rust-lang-ci/rust that referenced this issue Mar 8, 2024
…tchs, r=<try>

Replace the default branch with an unreachable branch If it is the last variant

Fixes rust-lang#119520. Fixes rust-lang#110097.

LLVM currently has limited ability to eliminate dead branches in switches, even with the patch of llvm/llvm-project#73446.

The main reasons are as follows:

- Additional costs are required to calculate the range of values, and there exist many scenarios that cannot be analyzed accurately.
- Matching values by bitwise calculation cannot handle odd branches, nor can it handle values like `-1, 0, 1`. See [SimplifyCFG.cpp#L5424](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/Transforms/Utils/SimplifyCFG.cpp#L5424) and https://llvm.godbolt.org/z/qYMqhvMa8
- The current range information is continuous, even if the metadata for the range is submitted. See [ConstantRange.cpp#L1869-L1870](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/IR/ConstantRange.cpp#L1869-L1870).
- The metadata of the range may be lost in passes such as SROA. See https://rust.godbolt.org/z/e7f87vKMK.

Although we can make improvements, I think it would be more appropriate to put this issue to rustc first. After all, we can easily know the possible values.

Note that we've currently found a slow compilation problem in the presence of unreachable branches. See
llvm/llvm-project#78578.

r? compiler
@bors bors closed this as completed in 14fbc3c Mar 8, 2024
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
A-mir-opt Area: MIR optimizations C-bug Category: This is a bug. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants