Skip to content

nontemporal_store: make sure that the intrinsic is truly just a hint #128149

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Merged
merged 3 commits into from
Aug 12, 2024

Conversation

RalfJung
Copy link
Member

@RalfJung RalfJung commented Jul 24, 2024

The !nontemporal flag for stores in LLVM sounds like it is just a hint, but actually, it is not -- at least on x86, non-temporal stores need very special treatment by the programmer or else the Rust memory model breaks down. LLVM still treats these stores as-if they were normal stores for optimizations, which is highly dubious. Let's avoid all that dubiousness by making our own non-temporal stores be truly just a hint, which is possible on some targets (e.g. ARM). On all other targets, non-temporal stores become regular stores.

Blocked on rust-lang/stdarch#1541 propagating to the rustc repo, to make sure the _mm_stream intrinsics are unaffected by this change.

Fixes #114582
Cc @Amanieu @workingjubilee

@rustbot
Copy link
Collaborator

rustbot commented Jul 24, 2024

r? @fmease

rustbot has assigned @fmease.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 24, 2024
@rustbot
Copy link
Collaborator

rustbot commented Jul 24, 2024

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

@rust-log-analyzer

This comment has been minimized.

@RalfJung RalfJung force-pushed the nontemporal_store branch from 05e50ef to 8d0d26e Compare July 24, 2024 19:40
@RalfJung RalfJung force-pushed the nontemporal_store branch 2 times, most recently from e2efa65 to d1e92af Compare July 24, 2024 20:03
@Kobzol
Copy link
Contributor

Kobzol commented Jul 24, 2024

It's a bit sad if this becomes an actual no-op on x86, nontemporal stores can be (very rarely) very useful. Is this usable through anything else than mm_stream_* and friends? These functions already contain a safety note that a fence has to be used after them.

@workingjubilee
Copy link
Member

It's a bit sad if this becomes an actual no-op on x86, nontemporal stores can be (very rarely) very useful. Is this usable through anything else than mm_stream_* and friends? These functions already contain a safety note that a fence has to be used after them.

Those intrinsic functions are, in the latest commits to std::arch, internally assembly. They should behave identically to how they did before, please file a bug if they emit notably worse codegen. The distinction is that, in general, LLVM does not try to perform store deduplication into assembly code.

@Kobzol
Copy link
Contributor

Kobzol commented Jul 25, 2024

So mm_stream will still emit nontemporal instructions on x86? Thanks for clarifying!

@RalfJung
Copy link
Member Author

Yes it will, that's why rust-lang/stdarch#1541 needs to propagate first.

@Kobzol
Copy link
Contributor

Kobzol commented Jul 25, 2024

Ok, great, then I misunderstood this change, sorry!

@workingjubilee
Copy link
Member

Thank you, @RalfJung.

@RalfJung RalfJung force-pushed the nontemporal_store branch from 0135bb1 to 697787a Compare August 5, 2024 08:57
@RalfJung
Copy link
Member Author

RalfJung commented Aug 5, 2024

rust-lang/stdarch#1541 has propagated, so this should be good to go. We haven't heard from @fmease though so let's re-roll the reviewer dice.

r? compiler

@rustbot rustbot assigned TaKO8Ki and unassigned fmease Aug 5, 2024
Copy link
Member

@jieyouxu jieyouxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! The changes themselves look good based on the prior discussions. Just a question about the codegen test with regards to target coverage.

Comment on lines +740 to +741
const WELL_BEHAVED_NONTEMPORAL_ARCHS: &[&str] =
&["aarch64", "arm", "riscv32", "riscv64"];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question [REV1 (2/2)]: If I understand it correctly, the codegen test is modified such that it only codegens for aarch64-unknown-linux-gnu target, but here we have 4 "well-behaved" architectures that we do emit !nontemporal for.

@jieyouxu
Copy link
Member

r? jieyouxu

@rustbot rustbot assigned jieyouxu and unassigned TaKO8Ki Aug 11, 2024
Copy link
Member

@jieyouxu jieyouxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, feel free to r=me,Amanieu,Jubilee after maybe fixing the nit :3

@RalfJung
Copy link
Member Author

@bors r=jieyouxu,Amanieu,Jubilee

@bors
Copy link
Collaborator

bors commented Aug 12, 2024

📌 Commit 5be6c7d has been approved by jieyouxu,Amanieu,Jubilee

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 12, 2024
@RalfJung
Copy link
Member Author

@bors r=jieyouxu,Amanieu,Jubilee

@bors
Copy link
Collaborator

bors commented Aug 12, 2024

📌 Commit 75743dc has been approved by jieyouxu,Amanieu,Jubilee

It is now in the queue for this repository.

bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 12, 2024
…llaumeGomez

Rollup of 10 pull requests

Successful merges:

 - rust-lang#128149 (nontemporal_store: make sure that the intrinsic is truly just a hint)
 - rust-lang#128394 (Unify run button display with "copy code" button and with mdbook buttons)
 - rust-lang#128537 (const vector passed through to codegen)
 - rust-lang#128632 (std: do not overwrite style in `get_backtrace_style`)
 - rust-lang#128878 (Slightly refactor `Flags` in bootstrap)
 - rust-lang#128886 (Get rid of some `#[allow(rustc::untranslatable_diagnostic)]`)
 - rust-lang#128929 (Fix codegen-units tests that were disabled 8 years ago)
 - rust-lang#128937 (Fix warnings in rmake tests on `x86_64-unknown-linux-gnu`)
 - rust-lang#128978 (Use `assert_matches` around the compiler more)
 - rust-lang#128994 (Fix bug in `Parser::look_ahead`.)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 095ca33 into rust-lang:master Aug 12, 2024
6 checks passed
@rustbot rustbot added this to the 1.82.0 milestone Aug 12, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Aug 12, 2024
Rollup merge of rust-lang#128149 - RalfJung:nontemporal_store, r=jieyouxu,Amanieu,Jubilee

nontemporal_store: make sure that the intrinsic is truly just a hint

The `!nontemporal` flag for stores in LLVM *sounds* like it is just a hint, but actually, it is not -- at least on x86, non-temporal stores need very special treatment by the programmer or else the Rust memory model breaks down. LLVM still treats these stores as-if they were normal stores for optimizations, which is [highly dubious](llvm/llvm-project#64521). Let's avoid all that dubiousness by making our own non-temporal stores be truly just a hint, which is possible on some targets (e.g. ARM). On all other targets, non-temporal stores become regular stores.

~~Blocked on rust-lang/stdarch#1541 propagating to the rustc repo, to make sure the `_mm_stream` intrinsics are unaffected by this change.~~

Fixes rust-lang#114582
Cc `@Amanieu` `@workingjubilee`
@RalfJung RalfJung deleted the nontemporal_store branch August 14, 2024 12:15
bjorn3 pushed a commit to bjorn3/rust that referenced this pull request Sep 22, 2024
…ouxu,Amanieu,Jubilee

nontemporal_store: make sure that the intrinsic is truly just a hint

The `!nontemporal` flag for stores in LLVM *sounds* like it is just a hint, but actually, it is not -- at least on x86, non-temporal stores need very special treatment by the programmer or else the Rust memory model breaks down. LLVM still treats these stores as-if they were normal stores for optimizations, which is [highly dubious](llvm/llvm-project#64521). Let's avoid all that dubiousness by making our own non-temporal stores be truly just a hint, which is possible on some targets (e.g. ARM). On all other targets, non-temporal stores become regular stores.

~~Blocked on rust-lang/stdarch#1541 propagating to the rustc repo, to make sure the `_mm_stream` intrinsics are unaffected by this change.~~

Fixes rust-lang#114582
Cc `@Amanieu` `@workingjubilee`
antoyo pushed a commit to antoyo/rust that referenced this pull request Jan 13, 2025
…ouxu,Amanieu,Jubilee

nontemporal_store: make sure that the intrinsic is truly just a hint

The `!nontemporal` flag for stores in LLVM *sounds* like it is just a hint, but actually, it is not -- at least on x86, non-temporal stores need very special treatment by the programmer or else the Rust memory model breaks down. LLVM still treats these stores as-if they were normal stores for optimizations, which is [highly dubious](llvm/llvm-project#64521). Let's avoid all that dubiousness by making our own non-temporal stores be truly just a hint, which is possible on some targets (e.g. ARM). On all other targets, non-temporal stores become regular stores.

~~Blocked on rust-lang/stdarch#1541 propagating to the rustc repo, to make sure the `_mm_stream` intrinsics are unaffected by this change.~~

Fixes rust-lang#114582
Cc `@Amanieu` `@workingjubilee`
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Non-temporal stores (and _mm_stream operations in stdarch) break our memory model
10 participants