Apply Profile-guided optimization to improve performance #9412

Open
FilipAndersson245 opened this issue Jun 26, 2021 · 14 comments
Labels
A-infra (CI and workflow issues), A-perf (performance issues), C-enhancement (Category: enhancement)

Comments

@FilipAndersson245

Profile-guided optimization (PGO) shows great promise for improving the speed of software. Last year, tests were made applying it to Rust itself, improving build time by ~15%.
Would it be feasible to do something similar for rust-analyzer to improve its speed?
As I understand it, the difficulties would be:

  • gathering runtime statistics
  • longer compilation times
@matklad
Member

matklad commented Jun 27, 2021

It would be feasible, but the tradeoff between the additional perf boost and the additional burden of maintaining a more complex build process is not worth it at this stage. It’s more impactful to spend the effort on making rust-analyzer more performant directly. The primary blocker for that work is understanding rust-analyzer’s heap structure: #9309

@lnicola
Member

lnicola commented Jun 28, 2021

I just tried this. It's an ~8% improvement in analysis-stats, especially in type inference:

baseline:

Database loaded:     651.08ms, 278minstr
  crates: 36, mods: 715, decls: 14906, fns: 11073
Item Collection:     9.72s, 74ginstr
  exprs: 301267, ??ty: 515 (0%), ?ty: 582 (0%), !ty: 220
Inference:           15.01s, 110ginstr
Total:               24.73s, 185ginstr

pgo:

Database loaded:     638.59ms, 273minstr
  crates: 36, mods: 715, decls: 14906, fns: 11073
Item Collection:     9.43s, 73ginstr
  exprs: 301267, ??ty: 515 (0%), ?ty: 582 (0%), !ty: 220
Inference:           13.28s, 107ginstr
Total:               22.71s, 180ginstr
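
The ~8% figure can be cross-checked from the Total and Inference lines of the two runs. A minimal sketch in POSIX shell with awk: the helper names total_seconds and pct_reduction are hypothetical, and the parser assumes the total is reported in seconds, as in the output above.

```shell
#!/bin/sh
# Extract the wall-clock total (in seconds) from analysis-stats output on stdin.
# Assumes a line shaped like "Total:   24.73s, 185ginstr", as in the runs above.
total_seconds() {
    awk '/^Total:/ { sub(/s,$/, "", $2); print $2 }'
}

# Percentage by which `new` is lower than `old`, to one decimal place.
pct_reduction() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (old - new) / old * 100 }'
}

# Usage with the numbers reported above:
baseline=$(printf 'Total:               24.73s, 185ginstr\n' | total_seconds)
pgo=$(printf 'Total:               22.71s, 180ginstr\n' | total_seconds)
pct_reduction "$baseline" "$pgo"   # total wall time: prints 8.2
pct_reduction 15.01 13.28          # inference only:  prints 11.5
```

In practice the two outputs would be saved to files and piped through total_seconds instead of being pasted inline.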

Steps for posterity, since they weren't obvious:

RUSTFLAGS="-C profile-generate" cargo build --release
target/release/rust-analyzer analysis-stats .
llvm-profdata merge *.profraw --output merged.profdata
RUSTFLAGS="-C profile-use=$PWD/merged.profdata" cargo build --release
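
The four steps above can be wrapped in one function, as a rough sketch. PROFILE_DIR, DRY_RUN, and the function names are hypothetical additions for illustration, not part of the original steps. It uses the `-C profile-generate=<dir>` form so the raw profiles land in a known directory instead of the current one. It also assumes an `llvm-profdata` on PATH whose LLVM version matches the one rustc uses (`rustup component add llvm-tools-preview` installs a matching one inside the toolchain directory).

```shell
#!/bin/sh
# Sketch of the PGO steps as one function. PROFILE_DIR and DRY_RUN are
# hypothetical knobs added for illustration; they are not from the thread.

# Echo a command, and execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

pgo_build() {
    workload="${1:-.}"                          # project to analyze while profiling
    profile_dir="${PROFILE_DIR:-$PWD/pgo-data}" # where .profraw files are written

    # 1. Instrumented build; raw profiles will be written to $profile_dir.
    run env RUSTFLAGS="-C profile-generate=$profile_dir" cargo build --release
    # 2. Run a representative workload to collect .profraw files.
    run target/release/rust-analyzer analysis-stats "$workload"
    # 3. Merge the raw profiles (llvm-profdata accepts a directory as input).
    run llvm-profdata merge -o "$profile_dir/merged.profdata" "$profile_dir"
    # 4. Rebuild using the merged profile.
    run env RUSTFLAGS="-C profile-use=$profile_dir/merged.profdata" cargo build --release
}

# Example: DRY_RUN=1 pgo_build .    (prints the commands without running them)
```

Stale .profraw files from an earlier instrumented build would be merged too, so clearing the profile directory between experiments is probably a good idea.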

Probably not worth the hassle for now.

@Veykril added the A-perf (performance issues) label May 28, 2022
@FilipAndersson245
Author

It has been some time since this issue was opened and RA is maturing quite nicely, so maybe we should look at this again?
I saw that Kobzol made https://github.com/Kobzol/cargo-pgo to simplify the process of getting a PGO binary.
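
For reference, a cargo-pgo round trip might look roughly like the sketch below. It assumes cargo-pgo is installed (`cargo install cargo-pgo`) and that `llvm-profdata` is available (for example via `rustup component add llvm-tools-preview`). The function name, the DRY_RUN switch, and the choice of analysis-stats as the workload are illustrative assumptions, not something cargo-pgo prescribes.

```shell
#!/bin/sh
# Sketch of a cargo-pgo round trip. DRY_RUN and the function arguments are
# hypothetical conveniences for illustration.

# Echo a command, and execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

cargo_pgo_round_trip() {
    workload="${1:-.}"
    # cargo-pgo builds for an explicit target, so the binary lands under
    # target/<host-triple>/release/. The triple may be passed in as $2;
    # otherwise it is read from `rustc -vV`.
    triple="${2:-$(rustc -vV | sed -n 's/^host: //p')}"

    run cargo pgo build                                    # instrumented build
    run "target/$triple/release/rust-analyzer" analysis-stats "$workload"
    run cargo pgo optimize                                 # rebuild with the profiles
}

# Example: DRY_RUN=1 cargo_pgo_round_trip . x86_64-unknown-linux-gnu
```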

@lnicola added the A-infra (CI and workflow issues) label Nov 1, 2022
@lnicola
Member

lnicola commented Nov 1, 2022

cargo pgo bolt build --with-pgo appears to crash BOLT on my system (LLVM 14.0.6, BOLT 14.0.6), and BOLT doesn't seem to be packaged with LLVM 14 in the distros I've tried, making it a bit hard to acquire. I still don't think it's worth bothering with that. As for plain PGO, on self:

# baseline
Database loaded:     508.15ms, 256minstr (metadata 317.41ms, 23minstr; build 102.87ms, 9210kinstr)
  crates: 43, mods: 869, decls: 19387, fns: 14479
Item Collection:     10.34s, 89ginstr
  exprs: 411733, ??ty: 45 (0%), ?ty: 130 (0%), !ty: 1
Inference:           37.58s, 265ginstr
Total:               47.92s, 355ginstr

# pgo
Database loaded:     489.57ms, 248minstr (metadata 304.50ms, 21minstr; build 104.55ms, 8241kinstr)
  crates: 43, mods: 869, decls: 19387, fns: 14479
Item Collection:     8.99s, 76ginstr
  exprs: 411733, ??ty: 45 (0%), ?ty: 130 (0%), !ty: 1
Inference:           29.43s, 227ginstr
Total:               38.42s, 304ginstr

for a speed-up of 20%.

For sysroot:

# baseline
Database loaded:     570.04ms, 132minstr (metadata 250.14ms, 2907kinstr; build 280.13ms, 31minstr)
  crates: 35, mods: 1267, decls: 39546, fns: 26480
Item Collection:     4.28s, 46ginstr
  exprs: 421064, ??ty: 42223 (10%), ?ty: 18705 (4%), !ty: 265
Inference:           29.26s, 213ginstr
Total:               33.54s, 259ginstr

# pgo
Database loaded:     547.65ms, 126minstr (metadata 253.65ms, 2687kinstr; build 255.81ms, 26minstr)
  crates: 35, mods: 1267, decls: 39546, fns: 26480
Item Collection:     3.71s, 41ginstr
  exprs: 421064, ??ty: 42223 (10%), ?ty: 18705 (4%), !ty: 265
Inference:           23.42s, 177ginstr
Total:               27.13s, 219ginstr

(19% faster)
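
Percentages like these can be read two ways: as a reduction in run time, or as a ratio of old to new. A small sketch that prints both for the numbers above; `compare` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the relative reduction and the old/new ratio for a pair of measurements.
compare() {
    awk -v old="$1" -v new="$2" 'BEGIN {
        printf "reduced by %.1f%%, ratio %.2fx\n",
               (old - new) / old * 100, old / new
    }'
}

compare 47.92 38.42   # self, seconds:    reduced by 19.8%, ratio 1.25x
compare 33.54 27.13   # sysroot, seconds: reduced by 19.1%, ratio 1.24x
compare 355 304       # self, ginstr:     reduced by 14.4%, ratio 1.17x
```

Both wall-time results are a reduction of roughly 19-20%. The instruction count for the self run drops less (about 14%).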

My steps in #9412 (comment) still work, but seem to yield a smaller improvement. I'm not sure if cargo pgo is doing some extra magic, or it's just measurement noise.

In any case, 15-20% is a decent improvement.

@FilipAndersson245
Author

Great! This looks awesome, 15-20% is a huge improvement.

@lnicola
Member

lnicola commented Nov 1, 2022

15-20% is a drop in the ocean compared to the algorithmic improvements (that nobody had time/managed/knew how to make) 😅.

@Veykril added the C-enhancement (Category: enhancement) label Feb 9, 2023
@zamazan4ik

Just for the record: I've applied PGO (without BOLT) to Clangd (a project similar to rust-analyzer, but for C++) and got nice improvements as well: link.

@FilipAndersson245
Author

FilipAndersson245 commented Apr 15, 2024

@lnicola It has been quite some time since PGO was last evaluated. Is rust-analyzer now in a state where it would be suitable to distribute PGO-optimized builds?

@ofek

ofek commented Dec 1, 2024

Are PGO builds now distributed?

@ChayimFriedman2
Contributor

Are PGO builds now distributed?

No. There are still bigger wins to be gained (which nobody is currently attempting, AFAIK).

@ofek

ofek commented Dec 1, 2024

Are you referring to this? #9309

@ChayimFriedman2
Contributor

@ofek Performance work is tracked in #17491.

@berkus

berkus commented Dec 2, 2024

@ChayimFriedman2 JFYI, I am using rust-analyzer every day, all day. Every single percent shaved off the runtime of this tool is a huge win for my work day. I urge you not to ignore a 15% win that is basically coming for free.

@FilipAndersson245
Author

I think that, as of right now, there is a bug with PGO + LTO in the Rust compiler that prevents both from being used simultaneously, so any exploration of potential performance gains would probably be best done after that is fixed.
rust-lang/rust#115344


8 participants