Evaluate using Profile-Guided Optimization (PGO) and Post-Link Optimization (PLO) #202

Open
zamazan4ik opened this issue Dec 3, 2023 · 0 comments


Hi!

I recently evaluated Profile-Guided Optimization (PGO) on multiple projects. The results are available here. According to those tests, PGO improves performance in many cases similar to lol-html, so I think it is worth trying on lol-html as well.

I already did some benchmarks and want to share my results.

Test environment

  • Fedora 39
  • Linux kernel 6.5.12
  • AMD Ryzen 9 5900X
  • 48 GiB RAM
  • SSD Samsung 980 Pro 2 TiB
  • Compiler - Rustc 1.74
  • lol-html version: the latest for now from the master branch on commit 44a7659d1ce018c27ae1f7e7913884bdedf72d71
  • Disabled Turbo boost

Benchmark

For benchmarking, I used the project's cargo bench suite. For PGO, I used the cargo-pgo tool. The same benchmark suite served as the PGO training workload, run in instrumented mode with cargo pgo bench. The PGO-optimized results come from cargo pgo optimize bench.
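
To make the procedure easier to reproduce, here is a minimal sketch of the workflow as a shell script. It assumes a checkout of lol-html as the working directory and a rustup-managed toolchain. The setup steps (installing cargo-pgo and the llvm-tools-preview component) are what cargo-pgo normally needs and are not spelled out in this issue. The function name pgo_workflow and the RUN variable are hypothetical helpers for illustration. By default the script only prints the commands (dry run).

```shell
#!/usr/bin/env sh
# Sketch of the PGO benchmark workflow described in this issue.
# Assumption: run from the root of a lol-html checkout.
set -eu

# Dry run by default: each command is printed instead of executed.
# Invoke with RUN="" to actually run the commands.
RUN="${RUN-echo}"

pgo_workflow() {
    # One-time setup (assumed prerequisites of cargo-pgo).
    $RUN rustup component add llvm-tools-preview
    $RUN cargo install cargo-pgo

    # 1. Baseline numbers without PGO.
    $RUN cargo bench

    # 2. Training phase: run the benchmarks with instrumentation
    #    to collect profiles.
    $RUN cargo pgo bench

    # 3. Rebuild with the collected profiles and benchmark again.
    $RUN cargo pgo optimize bench
}

pgo_workflow
```

Comparing the output of step 1 with the output of step 3 gives the Release vs. PGO-optimized comparison. Because the training workload and the measured workload are the same benchmark suite here, the gains may be more optimistic than what a different production workload would see.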

Results

I got the following results:

As I interpret the results, PGO measurably improves lol-html performance in many cases. Since this library is used internally in Cloudflare Workers, such a performance improvement can be valuable.

Further steps

I can suggest the following action points:

  • Perform more PGO benchmarks on lol-html. If they show improvements, add a note to the documentation about the possible performance gains from building lol-html with PGO.
  • Provide an easier way (e.g. a build option) to build lol-html with PGO. This would help end users and maintainers, since they could optimize lol-html for their own workloads.

Testing Post-Link Optimization techniques (like LLVM BOLT) would be interesting too (Clang and Rustc already use BOLT in addition to PGO), but I recommend starting with regular PGO.

Here are some examples of how PGO optimization is integrated into other projects:
