Skip to content

Rollup of 6 pull requests #84788

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
wants to merge 14 commits into from

Conversation

Dylan-DPC-zz
Copy link

Successful merges:

Failed merges:

r? @ghost
@rustbot modify labels: rollup

Create a similar rollup

nagisa and others added 14 commits April 11, 2021 01:18
This enables us to set more generic labels shared between targets. For
example `target_family="wasm"` across all targets that are conceptually
"wasm".

See rust-lang/reference#1006
…true`

Previously, the LD_LIBRARY_PATH for the linkchecker looked like
`build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/x86_64-unknown-linux-gnu/lib`, because the linkchecker depends on the master copy of the standard library. This is true, but doesn't include the library path for the compiler libraries:

```
/home/joshua/src/rust/rust/build/x86_64-unknown-linux-gnu/stage1-tools-bin/error_index_generator: error while loading shared libraries: libLLVM-12-rust-1.53.0-nightly.so: cannot open shared object file: No such file or directory
```

That file is in
`build/x86_64-unknown-linux-gnu/stage1/lib/libLLVM-12-rust-1.53.0-nightly.so`,
which wasn't included in the dynamic path. This adds `build/x86_64-unknown-linux-gnu/stage1/lib` to the dynamic path for the linkchecker.
On arm64 we have seen on several databases that ISB (instruction synchronization
barrier) is better to use than yield in a spin loop.  The yield instruction is a
nop.  The isb instruction puts the processor to sleep for some short time.  isb
is a good equivalent to the pause instruction on x86.

Below is an experiment that shows the effects of yield and isb on Arm64 and the
time of a pause instruction on x86 Intel processors.  The micro-benchmarks use
https://github.com/google/benchmark.git

$ cat a.cc
static void BM_scalar_increment(benchmark::State& state) {
  int i = 0;
  for (auto _ : state)
    benchmark::DoNotOptimize(i++);
}
BENCHMARK(BM_scalar_increment);
static void BM_yield(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("yield"::);
}
BENCHMARK(BM_yield);
static void BM_isb(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("isb"::);
}
BENCHMARK(BM_isb);
BENCHMARK_MAIN();

$ g++ -o run a.cc -O2 -lbenchmark -lpthread
$ ./run

--------------------------------------------------------------
Benchmark                    Time             CPU   Iterations
--------------------------------------------------------------

AWS Graviton2 (Neoverse-N1) processor:
BM_scalar_increment      0.485 ns        0.485 ns   1000000000
BM_yield                 0.400 ns        0.400 ns   1000000000
BM_isb                    13.2 ns         13.2 ns     52993304

AWS Graviton (A-72) processor:
BM_scalar_increment      0.897 ns        0.874 ns    801558633
BM_yield                 0.877 ns        0.875 ns    800002377
BM_isb                    13.0 ns         12.7 ns     55169412

Apple Arm64 M1 processor:
BM_scalar_increment      0.315 ns        0.315 ns   1000000000
BM_yield                 0.313 ns        0.313 ns   1000000000
BM_isb                    9.06 ns         9.06 ns     77259282

static void BM_pause(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("pause"::);
}
BENCHMARK(BM_pause);

Intel Skylake processor:
BM_scalar_increment      0.295 ns        0.295 ns   1000000000
BM_pause                  41.7 ns         41.7 ns     16780553

Tested on Graviton2 aarch64-linux with `./x.py test`.
This should fix linking of other C code (and soon Rust-generated code)
on aarch64 musl.
Enable outline-atomics by default as enabled in clang by the following commit
https://reviews.llvm.org/rGc5e7e649d537067dec7111f3de1430d0fc8a4d11

Performance improves by several orders of magnitude when using the LSE instructions
instead of the ARMv8.0 compatible load/store exclusive instructions.

Tested on Graviton2 aarch64-linux with
x.py build && x.py install && x.py test
…htriplett

[aarch64] add target feature outline-atomics

Enable outline-atomics by default as enabled in clang by the following commit
https://reviews.llvm.org/rGc5e7e649d537067dec7111f3de1430d0fc8a4d11

Performance improves by several orders of magnitude when using the LSE instructions
instead of the ARMv8.0 compatible load/store exclusive instructions.

Tested on Graviton2 aarch64-linux with
x.py build && x.py install && x.py test
… r=petrochenkov

Allow setting `target_family` to multiple values, and implement `target_family="wasm"`

As per the conclusion in [this thread](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Are.20we.20comfortable.20with.20adding.20an.20insta-stable.20cfg.28wasm.29.3F/near/233158441), this implements an ability to specify any number of `target_family` values, allowing for more flexible generic groups, or "families", to be created than just the OS-based unix/windows dichotomy.

cc rust-lang/reference#1006
…acrum

Allow running `x.py test --stage 2 src/tools/linkchecker` with `download-rustc = true`

Previously, the LD_LIBRARY_PATH for the linkchecker looked like
`build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/x86_64-unknown-linux-gnu/lib`, because the linkchecker depends on the master copy of the standard library. This is true, but doesn't include the library path for the compiler libraries:

```
/home/joshua/src/rust/rust/build/x86_64-unknown-linux-gnu/stage1-tools-bin/error_index_generator: error while loading shared libraries: libLLVM-12-rust-1.53.0-nightly.so: cannot open shared object file: No such file or directory
```

That file is in
`build/x86_64-unknown-linux-gnu/stage1/lib/libLLVM-12-rust-1.53.0-nightly.so`,
which wasn't included in the dynamic path. This adds `build/x86_64-unknown-linux-gnu/stage1/lib` to the dynamic path for the linkchecker.
[Arm64] use isb instruction instead of yield in spin loops

On arm64 we have seen on several databases that ISB (instruction synchronization
barrier) is better to use than yield in a spin loop.  The yield instruction is a
nop.  The isb instruction puts the processor to sleep for some short time.  isb
is a good equivalent to the pause instruction on x86.

Below is an experiment that shows the effects of yield and isb on Arm64 and the
time of a pause instruction on x86 Intel processors.  The micro-benchmarks use
https://github.com/google/benchmark.git

```
$ cat a.cc
static void BM_scalar_increment(benchmark::State& state) {
  int i = 0;
  for (auto _ : state)
    benchmark::DoNotOptimize(i++);
}
BENCHMARK(BM_scalar_increment);
static void BM_yield(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("yield"::);
}
BENCHMARK(BM_yield);
static void BM_isb(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("isb"::);
}
BENCHMARK(BM_isb);
BENCHMARK_MAIN();

$ g++ -o run a.cc -O2 -lbenchmark -lpthread
$ ./run

--------------------------------------------------------------
Benchmark                    Time             CPU   Iterations
--------------------------------------------------------------

AWS Graviton2 (Neoverse-N1) processor:
BM_scalar_increment      0.485 ns        0.485 ns   1000000000
BM_yield                 0.400 ns        0.400 ns   1000000000
BM_isb                    13.2 ns         13.2 ns     52993304

AWS Graviton (A-72) processor:
BM_scalar_increment      0.897 ns        0.874 ns    801558633
BM_yield                 0.877 ns        0.875 ns    800002377
BM_isb                    13.0 ns         12.7 ns     55169412

Apple Arm64 M1 processor:
BM_scalar_increment      0.315 ns        0.315 ns   1000000000
BM_yield                 0.313 ns        0.313 ns   1000000000
BM_isb                    9.06 ns         9.06 ns     77259282
```

```
static void BM_pause(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("pause"::);
}
BENCHMARK(BM_pause);

Intel Skylake processor:
BM_scalar_increment      0.295 ns        0.295 ns   1000000000
BM_pause                  41.7 ns         41.7 ns     16780553
```

Tested on Graviton2 aarch64-linux with `./x.py test`.
…ns, r=Amanieu

Update compiler-builtins to 0.1.41 to get fix for outlined atomics

This should fix linking of other C code (and soon Rust-generated code)
on aarch64 musl.
@rustbot rustbot added the rollup A PR which is a rollup label May 1, 2021
@Dylan-DPC-zz Dylan-DPC-zz deleted the rollup-qevgi10 branch May 1, 2021 12:44
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
rollup A PR which is a rollup
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants