Skip to content

rustc 1.48.0+ generates branchier and less compact code for integer division w/ boundary checks #99961

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
mqudsi opened this issue Jul 30, 2022 · 1 comment
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@mqudsi
Copy link
Contributor

mqudsi commented Jul 30, 2022

The following code, which correctly bypasses overflow and divide by zero conditions, generates branchier and less optimal machine code (under x86_64) when compiled under rustc 1.48.0+ as compared to rustc 1.46.0 and below:

pub fn checked_div_i64(dividend: i64, divisor: i64) -> Option<i64> {
    if dividend > i64::min_value() && divisor != 0 {
            Some(dividend / divisor)
    } else {
        None
    }
}

Godbolt comparison link

The assembly generated in 1.46.0:

example::checked_div_i64:
        xor     eax, eax
        movabs  rcx, -9223372036854775808
        cmp     rdi, rcx
        je      .LBB0_3
        test    rsi, rsi
        je      .LBB0_3
        mov     rax, rdi
        cqo
        idiv    rsi
        mov     rdx, rax
        mov     eax, 1
.LBB0_3:
        ret

vs in 1.48.0:

example::checked_div_i64:
  xor eax, eax
  movabs rcx, -9223372036854775808
  cmp rdi, rcx
  je .LBB0_6
  test rsi, rsi
  je .LBB0_6
  mov rax, rdi
  or rax, rsi
  shr rax, 32
  je .LBB0_3
  mov rax, rdi
  cqo
  idiv rsi
  mov rdx, rax
  jmp .LBB0_5
.LBB0_3:
  mov eax, edi
  xor edx, edx
  div esi
  mov edx, eax
.LBB0_5:
  mov eax, 1
.LBB0_6:
  ret

The astute will observe that rustc 1.47.0 was skipped in the good/bad demarcation above. That's because rustc 1.47.0 introduced an incorrect checked division w/ panic (which is now happening again). My gut feeling is that whatever fixed 1.47's dismal codegen resulted in a still-suboptimal solution as compared to what we had before.

The problem still persists in the latest nightlies, but is obscured by #99960.

@rustbot label +regression-from-stable-to-stable +A-codegen +A-llvm +T-compiler

@mqudsi mqudsi added the C-bug Category: This is a bug. label Jul 30, 2022
@rustbot rustbot added A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Jul 30, 2022
@mqudsi
Copy link
Contributor Author

mqudsi commented Jul 30, 2022

After the discussion in #99960, I'm going to close this because it's just tracking the minutiae of LLVM's optimization changes between LLVM 10 and LLVM 11, and the penalty isn't that big of a deal (compared to generating the panic code). There are some minor differences in the unoptimized IR but I don't think they're responsible.

@rustbot label -I-prioritize

@mqudsi mqudsi closed this as completed Jul 30, 2022
@rustbot rustbot removed the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Jul 30, 2022
@workingjubilee workingjubilee closed this as not planned Won't fix, can't repro, duplicate, stale Aug 2, 2022
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

3 participants