Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Subtree sync for rustc_codegen_cranelift #127162

Merged
merged 71 commits into from
Jun 30, 2024

Conversation

bjorn3
Copy link
Member

@bjorn3 bjorn3 commented Jun 30, 2024

The main highlight this time is support for arm64 macOS in cg_clif. A future PR will enable distributing it as rustup component.

r? @ghost

@rustbot label +A-codegen +A-cranelift +T-compiler

pacak and others added 30 commits April 19, 2024 08:31
…, r=RalfJung,nikic

compiler: add simd_ctpop intrinsic

Fairly straightforward addition.

cc `@rust-lang/opsem` new (extremely boring) intrinsic
Typical uses of ThinLTO don't have any use for this as a standalone
file, but distributed ThinLTO uses this to make the linker phase more
efficient. With clang you'd do something like `clang -flto=thin
-fthin-link-bitcode=foo.indexing.o -c foo.c` and then get both foo.o
(full of bitcode) and foo.indexing.o (just the summary or index part of
the bitcode). That's then usable by a two-stage linking process that's
more friendly to distributed build systems like bazel, which is why I'm
working on this area.

I talked some to @teresajohnson about naming in this area, as things
seem to be a little confused between various blog posts and build
systems. "bitcode index" and "bitcode summary" tend to be a little too
ambiguous, and she tends to use "thin link bitcode" and "minimized
bitcode" (which matches the descriptions in LLVM). Since the clang
option is thin-link-bitcode, I went with that to try and not add a new
spelling in the world.

Per @dtolnay, you can work around the lack of this by using `lld
--thinlto-index-only` to do the indexing on regular .o files of
bitcode, but that is a bit wasteful on actions when we already have all
the information in rustc and could just write out the matching minimized
bitcode. I didn't test that at all in our infrastructure, because by the
time I learned that I already had this patch largely written.
The standard library now has the right configs in it's Cargo.toml
rustc_codegen_llvm: add support for writing summary bitcode

Typical uses of ThinLTO don't have any use for this as a standalone file, but distributed ThinLTO uses this to make the linker phase more efficient. With clang you'd do something like `clang -flto=thin -fthin-link-bitcode=foo.indexing.o -c foo.c` and then get both foo.o (full of bitcode) and foo.indexing.o (just the summary or index part of the bitcode). That's then usable by a two-stage linking process that's more friendly to distributed build systems like bazel, which is why I'm working on this area.

I talked some to `@teresajohnson` about naming in this area, as things seem to be a little confused between various blog posts and build systems. "bitcode index" and "bitcode summary" tend to be a little too ambiguous, and she tends to use "thin link bitcode" and "minimized bitcode" (which matches the descriptions in LLVM). Since the clang option is thin-link-bitcode, I went with that to try and not add a new spelling in the world.

Per `@dtolnay,` you can work around the lack of this by using `lld --thinlto-index-only` to do the indexing on regular .o files of bitcode, but that is a bit wasteful on actions when we already have all the information in rustc and could just write out the matching minimized bitcode. I didn't test that at all in our infrastructure, because by the time I learned that I already had this patch largely written.
This replaces the drop_in_place reference with null in vtables. On
librustc_driver.so, this drops about ~17k dynamic relocations from the
output, since many vtables can now be placed in read-only memory, rather
than having a relocated pointer included.

This makes a tradeoff by adding a null check at vtable call sites.
That's hard to avoid without changing the vtable format (e.g., to use a
pc-relative relocation instead of an absolute address, and avoid the
dynamic relocation that way). But it seems likely that the check is
cheap at runtime.
Add an intrinsic for `ptr::metadata`

The follow-up to rust-lang#123840, so we can remove `PtrComponents` and `PtrRepr` from libcore entirely (well, after a bootstrap update).

As discussed in <https://rust-lang.zulipchat.com/#narrow/stream/189540-t-compiler.2Fwg-mir-opt/topic/.60ptr_metadata.60.20in.20MIR/near/435637808>, this introduces `UnOp::PtrMetadata` taking a raw pointer and returning the associated metadata value.

By no longer going through a `union`, this should also help future PRs better optimize pointer operations.

r? ``@oli-obk``
Show files produced by `--emit foo` in json artifact notifications

Right now it is possible to ask `rustc` to save some intermediate representation into one or more files with `--emit=foo`, but figuring out what exactly was produced is difficult. This pull request adds information about `llvm_ir` and `asm` intermediate files into notifications produced by `--json=artifacts`.

Related discussion: https://internals.rust-lang.org/t/easier-access-to-files-generated-by-emit-foo/20477

Motivation - `cargo-show-asm` parses those intermediate files and presents them in a user friendly way, but right now I have to apply some dirty hacks. Hacks make behavior confusing: hintron/computer-enhance#35

This pull request introduces a new behavior: now `rustc` will emit a new artifact notification for every artifact type user asked to `--emit`, for example for `--emit asm` those will include all the `.s` files.

Most users won't notice this behavior, to be affected by it all of the following must hold:
- user must use `rustc` binary directly (when `cargo` invokes `rustc` - it consumes artifact notifications and doesn't emit anything)
- user must specify both `--emit xxx` and `--json artifacts`
- user must refuse to handle unknown artifact types
- user must disable incremental compilation (or deal with it better than cargo does, or use a workaround like `save-temps`) in order not to hit rust-lang#88829 / rust-lang#89149
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. has-merge-commits PR has merge commits, merge with caution. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 30, 2024
@rustbot

This comment was marked as resolved.

@rustbot rustbot added A-codegen Area: Code generation A-cranelift Things relevant to the [future] cranelift backend labels Jun 30, 2024
@bjorn3
Copy link
Member Author

bjorn3 commented Jun 30, 2024

@bors r+ subtree sync

@bors
Copy link
Contributor

bors commented Jun 30, 2024

📌 Commit 9ec6a02 has been approved by bjorn3

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 30, 2024
@bjorn3
Copy link
Member Author

bjorn3 commented Jun 30, 2024

@bors p=1

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Jun 30, 2024
…bjorn3

Subtree sync for rustc_codegen_cranelift

The main highlight this time is support for arm64 macOS in cg_clif. A future PR will enable distributing it as rustup component.

r? `@ghost`

`@rustbot` label +A-codegen +A-cranelift +T-compiler
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 30, 2024
…iaskrgr

Rollup of 8 pull requests

Successful merges:

 - rust-lang#126018 (Remove the `box_pointers` lint.)
 - rust-lang#126895 (Fix simd_gather documentation)
 - rust-lang#126981 (Replace some magic booleans in match-lowering with enums)
 - rust-lang#127069 (small correction to fmt::Pointer impl)
 - rust-lang#127157 (coverage: Avoid getting extra unexpansion info when we don't need it)
 - rust-lang#127160 (Add a regression test for rust-lang#123630)
 - rust-lang#127161 (Improve `run-make-support` library `args` API)
 - rust-lang#127162 (Subtree sync for rustc_codegen_cranelift)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors
Copy link
Contributor

bors commented Jun 30, 2024

⌛ Testing commit 9ec6a02 with merge ef3d6fd...

bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 30, 2024
Distribute rustc_codegen_cranelift for arm64 macOS

Support for arm64 macOS has been added to rustc_codegen_cranelift recently.

Based on rust-lang#127162

try-job: dist-aarch64-apple
@bors
Copy link
Contributor

bors commented Jun 30, 2024

☀️ Test successful - checks-actions
Approved by: bjorn3
Pushing ef3d6fd to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jun 30, 2024
@bors bors merged commit ef3d6fd into rust-lang:master Jun 30, 2024
7 checks passed
@rustbot rustbot added this to the 1.81.0 milestone Jun 30, 2024
@bjorn3 bjorn3 deleted the sync_cg_clif-2024-06-30 branch June 30, 2024 18:34
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 30, 2024
Distribute rustc_codegen_cranelift for arm64 macOS

Support for arm64 macOS has been added to rustc_codegen_cranelift recently.

Based on rust-lang#127162

try-job: dist-aarch64-apple
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (ef3d6fd): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-0.2% [-0.2%, -0.2%] 1
All ❌✅ (primary) - - 0

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

Results (secondary -2.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.0% [-2.0%, -2.0%] 1
All ❌✅ (primary) - - 0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 696.099s -> 697.442s (0.19%)
Artifact size: 324.69 MiB -> 324.64 MiB (-0.02%)

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
A-codegen Area: Code generation A-cranelift Things relevant to the [future] cranelift backend has-merge-commits PR has merge commits, merge with caution. merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.