35% performance regression in generated code since 1.24 #53833
Comments
Maybe that's the cause? Could you try with one codegen unit? As for the improvement in 1.25.0, you might thank the LLVM update for it :).
Wow, the difference is enormous! With
1.25.0 is even a bit faster than 1.23.0 now (due to the LLVM upgrade?):
Although it looks like that advantage has been lost again. 1.26.0 was still fast:
But 1.27.0 again performs similar to 1.23.0.
1.28.0 performs similarly to 1.27.0. Thanks for the suggestion @est31! Does that resolve this issue, or is the 1.26 – 1.27 difference still worth investigating?
There's a tracking issue for codegen unit regressions in #47745. I think all the issues it links to are kept open. About the 1.26 -> 1.27 regression, it could probably be narrowed down using rustup and nightlies, and/or cargo-bisect-rustc. Not sure though whether it's worth the effort.
Since Rust 1.24.0, the number of codegen units is no longer 1, even for release builds. This speeds up builds slightly, but it reduces the quality of the generated code tremendously. To get decent performance out of Claxon, we need to explicitly pass -C codegen-units=1 as RUSTFLAGS. Update the benchmarks to do so, and add a note in the readme. Thanks to est31 for pointing this out. See also rust-lang/rust#53833.
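For anyone driving the builds from a small Rust helper rather than a shell script, the following is a minimal sketch of how the flag could be passed. The function name `cargo_command` and the `build --release` subcommand are assumptions for illustration, not what Claxon's benchmark script actually runs. The `+<toolchain>` argument assumes `cargo` is the rustup proxy.

```rust
use std::process::Command;

/// Build a `cargo +<toolchain> build --release` invocation that forces a
/// single codegen unit via RUSTFLAGS. The subcommand is an assumption; swap in
/// whatever the benchmark script really runs.
fn cargo_command(toolchain: &str) -> Command {
    let mut cmd = Command::new("cargo");
    // `+<toolchain>` is handled by the rustup proxy, not by cargo itself.
    cmd.arg(format!("+{}", toolchain))
        .arg("build")
        .arg("--release")
        // RUSTFLAGS applies to every rustc invocation that cargo makes.
        .env("RUSTFLAGS", "-C codegen-units=1");
    cmd
}
```

Calling `cargo_command("1.25.0").status()` would then run the build with that toolchain. The command is only constructed here, so nothing is executed until `status()` or `output()` is called.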
triage: P-medium, since I don't think investigating this takes priority over other things currently being juggled. but also self-assigning since I'm curious and I think I can do some bisection as a background task, assuming I can reproduce the perf regression locally.
Summary
Claxon, when compiled with Rust 1.25.0 or later, takes 1.36 times as long to run as a version compiled with Rust 1.23.0. When compiled with Rust 1.24.0, it takes 1.45 times as long to run.
It looks like there was a severe regression in generated code in 1.24.0. 1.25.0 improved a bit again, but is still significantly worse than 1.23.0.
Steps to reproduce
Prepare:
You can also copy your own files into testsamples/extra if you happen to have some lying around.
Then, with a Rust 1.23.0 toolchain, or after changing this line to use cargo +1.23.0:
Then, with a Rust 1.25.0 toolchain, or after updating the script to use cargo +1.25.0:
Output in my case:
The numbers in the rightmost column show the running time of the benchmark compiled with Rust 1.25.0 relative to the running time of the benchmark compiled with Rust 1.23.0.
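To make the meaning of that relative figure concrete, here is a minimal sketch that divides the mean running time of one build by the mean of a baseline build. The function names and the choice of the mean are assumptions; the actual comparison script may use a different statistic.

```rust
/// Arithmetic mean of a set of durations in seconds.
/// Returns `None` for an empty slice.
fn mean(samples: &[f64]) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    Some(samples.iter().sum::<f64>() / samples.len() as f64)
}

/// Running time of `candidate` relative to `baseline`.
/// A value above 1.0 means the candidate build is slower.
fn relative_running_time(baseline: &[f64], candidate: &[f64]) -> Option<f64> {
    let base = mean(baseline)?;
    let cand = mean(candidate)?;
    if base == 0.0 {
        // Avoid dividing by zero on degenerate input.
        return None;
    }
    Some(cand / base)
}
```

With this definition, the 1.36 quoted in the summary would correspond to the newer build's mean time being 1.36 times the 1.23.0 mean.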
For Rust 1.23.0 vs Rust 1.24.0 I get these results:
For Rust 1.23.0 vs Rust 1.28.0 I get these results:
For Rust 1.29.0-beta.1 and 1.30.0-nightly (3edb355 2018-08-03) I get similar results.
To show that the setup works, this is Rust 1.13.0 vs Rust 1.23.0, which shows no significant difference:
And Rust 1.25.0 vs Rust 1.28.0 does not show a significant difference either:
Details
Claxon is a decoder for the flac audio format; the benchmark decodes every file in testsamples/extra 5 times and collects statistics about the duration.
The benchmark is compiled with -C target_cpu=native.
I use a Skylake i7.
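The measurement loop described above can be sketched roughly as follows. This is not Claxon's benchmark code: `time_runs`, `Stats`, and the choice of min/mean/max are hypothetical, and the closure stands in for decoding one file.

```rust
use std::time::{Duration, Instant};

/// Summary statistics over repeated runs (hypothetical shape).
struct Stats {
    min: Duration,
    mean: Duration,
    max: Duration,
}

/// Run `work` `runs` times and summarize the wall-clock durations.
/// Returns `None` when `runs` is zero.
fn time_runs<F: FnMut()>(runs: u32, mut work: F) -> Option<Stats> {
    if runs == 0 {
        return None;
    }
    let mut samples = Vec::with_capacity(runs as usize);
    for _ in 0..runs {
        let start = Instant::now();
        work();
        samples.push(start.elapsed());
    }
    let total: Duration = samples.iter().sum();
    Some(Stats {
        min: *samples.iter().min()?,
        mean: total / runs,
        max: *samples.iter().max()?,
    })
}
```

A call such as `time_runs(5, || decode_file(path))`, where `decode_file` is a placeholder for the real decoding routine, would mirror the five repetitions per file mentioned above.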