Compile constant SIMD initialiser to a constant vector expression #18147

Closed
huonw opened this issue Oct 18, 2014 · 5 comments
Labels
A-codegen Area: Code generation A-SIMD Area: SIMD (Single Instruction Multiple Data) C-enhancement Category: An issue proposing an enhancement or a PR with one. I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@huonw
Member

huonw commented Oct 18, 2014

Currently

#![crate_type = "lib"]

pub fn foo(x: f64, y: f64) -> std::simd::f64x2 {
    std::simd::f64x2(0.0, 1.0)
}

becomes, with no optimisations,

; Function Attrs: uwtable
define <2 x double> @_ZN3foo20h36a71d373a6347d3daaE(double, double) unnamed_addr #0 {
entry-block:
  %sret_slot = alloca <2 x double>
  %x = alloca double
  %y = alloca double
  store double %0, double* %x
  store double %1, double* %y
  %2 = getelementptr inbounds <2 x double>* %sret_slot, i32 0, i32 0
  store double 0.000000e+00, double* %2
  %3 = getelementptr inbounds <2 x double>* %sret_slot, i32 0, i32 1
  store double 1.000000e+00, double* %3
  %4 = load <2 x double>* %sret_slot
  ret <2 x double> %4
}

After optimisations it becomes

; Function Attrs: nounwind readnone uwtable
define <2 x double> @_ZN3foo20h36a71d373a6347d3daaE(double, double) unnamed_addr #0 {
entry-block:
  ret <2 x double> <double 0.000000e+00, double 1.000000e+00>
}

We could detect constants in a SIMD initialiser and compile to this directly, making our no-opt code faster, and saving the optimiser work.
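
As a rough illustration of the check proposed above, here is a minimal sketch over a toy model. `Operand`, `Lowered`, and `lower_simd_init` are hypothetical names invented for this example and are not rustc's actual codegen types or functions; the sketch only shows the decision "all lanes constant, so emit one constant vector; otherwise fall back to per-lane stores".

```rust
// Hypothetical toy model of a SIMD initialiser's lanes; not rustc internals.
#[derive(Debug, Clone, PartialEq)]
pub enum Operand {
    /// A lane whose value is known at compile time.
    Const(f64),
    /// A lane read from a local variable (only known at run time).
    Local(String),
}

// Hypothetical result of lowering the initialiser.
#[derive(Debug, Clone, PartialEq)]
pub enum Lowered {
    /// One constant vector, e.g. `<2 x double> <double 0.0, double 1.0>`.
    ConstVector(Vec<f64>),
    /// One store per lane into a stack slot, followed by a load of the vector.
    PerLaneStores(Vec<Operand>),
}

/// Emits a constant vector when every lane is a constant,
/// and falls back to per-lane stores otherwise.
pub fn lower_simd_init(lanes: &[Operand]) -> Lowered {
    // Collecting into `Option<Vec<_>>` yields `None` as soon as
    // any lane turns out not to be a constant.
    let consts: Option<Vec<f64>> = lanes
        .iter()
        .map(|op| match op {
            Operand::Const(c) => Some(*c),
            Operand::Local(_) => None,
        })
        .collect();

    match consts {
        Some(values) => Lowered::ConstVector(values),
        None => Lowered::PerLaneStores(lanes.to_vec()),
    }
}
```

Under this model, the initialiser `f64x2(0.0, 1.0)` from the example would take the `ConstVector` path, while something like `f64x2(0.0, x)` would keep the per-lane stores.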

@steveklabnik
Member

@huonw with std::simd gone, is this still an issue? I haven't been keeping as close an eye on your simd work.

@huonw
Member Author

huonw commented Feb 2, 2016

Yes, this applies to any #[repr(simd)] type.

@Mark-Simulacrum Mark-Simulacrum added A-SIMD Area: SIMD (Single Instruction Multiple Data) C-enhancement Category: An issue proposing an enhancement or a PR with one. I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 22, 2017
@nox
Contributor

nox commented Mar 31, 2018

Cc @rust-lang/wg-compiler-performance now that SIMD support is going to be stabilised.

@workingjubilee
Member

workingjubilee commented Oct 15, 2020

According to Godbolt, rustc +nightly --emit=llvm-ir -Copt-level=0 now gives

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define void @_ZN7example3foo17h2ac4b13db8a0abecE
  (<2 x double>* noalias nocapture sret dereferenceable(16) %0, double %x, double %y)
  unnamed_addr #0 !dbg !6 {
    %1 = bitcast <2 x double>* %0 to double*, !dbg !10
    store double 0.000000e+00, double* %1, align 16, !dbg !10
    %2 = getelementptr inbounds <2 x double>, <2 x double>* %0, i32 0, i32 1, !dbg !10
    store double 1.000000e+00, double* %2, align 8, !dbg !10
    ret void, !dbg !11
}

attributes #0 = { nonlazybind uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" }

!llvm.module.flags = !{!0, !1, !2}
!llvm.dbg.cu = !{!3}

and -Copt-level=3 now gives

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define void @_ZN7example3foo17h2ac4b13db8a0abecE
  (<2 x double>* noalias nocapture sret dereferenceable(16) %0, double %x, double %y)
  unnamed_addr #0 !dbg !6 {
    store <2 x double> <double 0.000000e+00, double 1.000000e+00>, <2 x double>* %0, align 16, !dbg !10
    ret void, !dbg !11
}

attributes #0 = { nofree norecurse nounwind nonlazybind uwtable writeonly
"probe-stack"="__rust_probestack" "target-cpu"="x86-64" }

I'm... not sure how to read what seem like significant syntactic differences in these IR outputs, but I have a hunch that the result is "no change".

@workingjubilee
Member

Hm... With new eyes, I can affirm. No change.
Soluble, though.
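
One way to read the two IR listings above: the unoptimised output still writes each lane with its own scalar `store double`, while the optimised output writes the whole constant vector with a single `store <2 x double>`. The sketch below mirrors that difference in plain Rust, using a `[f64; 2]` array as a stand-in for the SIMD vector (an assumption made so the example runs on stable Rust). The function names are hypothetical, and the `out` parameter plays the role of the `sret` pointer in the IR.

```rust
/// Writes each lane separately, like the two scalar `store double`
/// instructions in the -Copt-level=0 output.
pub fn init_per_lane(out: &mut [f64; 2]) {
    out[0] = 0.0;
    out[1] = 1.0;
}

/// Writes the whole value at once, like the single
/// `store <2 x double> <double 0.0, double 1.0>` in the -Copt-level=3 output.
pub fn init_whole(out: &mut [f64; 2]) {
    *out = [0.0, 1.0];
}
```

Both functions leave the same values in `out`; the difference is only in how many stores are needed to get there, which is what this issue asks the compiler to avoid without relying on the optimiser.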

@bors bors closed this as completed in 540891b Oct 17, 2024