Skip to content

ptr::read() is slower than directly dereferencing #73258

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
jrmuizel opened this issue Jun 11, 2020 · 5 comments · Fixed by #109035
Closed

ptr::read() is slower than directly dereferencing #73258

jrmuizel opened this issue Jun 11, 2020 · 5 comments · Fixed by #109035
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@jrmuizel
Copy link
Contributor

#[derive(Clone, Copy)]
pub enum Foo {
    A, B, C, D,
}

pub fn foo(v: *const Foo) -> Foo {
    let k: Option<Foo> = unsafe { Some(*v) };
    return k.unwrap();
}

compiles to

example::foo:
        mov     al, byte ptr [rdi]
        ret

but replacing *v with v.read()

#[derive(Clone, Copy)]
pub enum Foo {
    A, B, C, D,
}

pub fn foo(v: *const Foo) -> Foo {
    let k: Option<Foo> = unsafe { Some(v.read()) };
    return k.unwrap();
}

compiles to:

example::foo:
        push    rax
        mov     al, byte ptr [rdi]
        cmp     al, 4
        je      .LBB0_1
        pop     rcx
        ret
.LBB0_1:
        lea     rdi, [rip + .L__unnamed_1]
        lea     rdx, [rip + .L__unnamed_2]
        mov     esi, 43
        call    qword ptr [rip + core::panicking::panic@GOTPCREL]
        ud2

This came up from #71257

@jrmuizel jrmuizel added the C-bug Category: This is a bug. label Jun 11, 2020
@alex
Copy link
Member

alex commented Jun 11, 2020

godbolt for the bad version here: https://rust.godbolt.org/z/CWboVo

took me a second, but the panic is from the unwrap, suggesting that the discriminant range isn't propagated correctly

@jonas-schievink jonas-schievink added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. A-codegen Area: Code generation labels Jun 11, 2020
@jrmuizel
Copy link
Contributor Author

This is presumably because the copy_nonoverlapping in the implementation of ptr::read() does not emit range metadata.

@hanna-kruppe
Copy link
Contributor

This matches my impression from skimming over -print-after-all output, though I'll add that the call to ptr::read() does have the range metadata, so if that metadata got propagated before or during inlining, we'd probably be in a better position. But a simpler solution in the short term might be to implement ptr::read with a compiler intrinsic instead of the current MaybeUninit+copy_nonoverlapping contraption.

@scottmcm
Copy link
Member

LLVM-IR diff: https://rust.godbolt.org/z/vMex9ooxq

Looks like the root cause is that read doesn't get !range metadata:

-  %_3 = load i8, ptr %v, align 1, !dbg !11, !range !13, !noundef !10
+  %tmp.0.copyload = load i8, ptr %v, align 1, !dbg !11

Will hopefully be fixed by #109035

@scottmcm
Copy link
Member

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants