✨[Feature] Evaluate using Profile-Guided Optimization (PGO) and Post-Link Optimization (PLO) #2511

Open
zamazan4ik opened this issue Dec 2, 2023 · 3 comments
Labels
feature request New feature or request

Comments

@zamazan4ik

Is your feature request related to a problem? Please describe.

Not a problem, but an idea for how TensorRT performance can be improved.

I checked Profile-Guided Optimization (PGO) and Post-Link Optimization (PLO) improvements on multiple projects. The results are available here. According to the tests, these optimizations can improve performance in many kinds of applications: compilers and interpreters, static analysis, databases, networking, etc. Given these results, I think optimizing TensorRT (its C++ part) with PGO and PLO would be a good idea.

Describe the solution you'd like

I can suggest the following things:

  • Perform PGO benchmarks on TensorRT. If they show improvements, add a note to the documentation about the possible performance gains from PGO.
  • Provide an easier way (e.g. a build option in the build scripts) to build TensorRT with PGO. This would help end users and maintainers, since they could optimize TensorRT for their own workloads.
  • Optimize the pre-built TensorRT binaries with PGO.
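
For readers unfamiliar with the workflow behind these suggestions, here is a minimal sketch of an instrumentation-based PGO build, assuming a Clang/LLVM toolchain. The source file, binary name, profile directory, and workload arguments are hypothetical placeholders, not actual TensorRT build targets. The script only assembles the command lines and does not run them.

```python
from typing import List


def pgo_commands(source: str, binary: str, profile_dir: str,
                 workload_args: List[str]) -> List[List[str]]:
    """Return the four steps of an instrumentation-based PGO build with Clang.

    All paths and names are hypothetical placeholders; nothing is executed.
    """
    merged = f"{profile_dir}/merged.profdata"
    return [
        # 1. Build an instrumented binary that writes .profraw files to profile_dir.
        ["clang++", "-O2", f"-fprofile-generate={profile_dir}", source, "-o", binary],
        # 2. Run a representative workload to collect the profile.
        [f"./{binary}", *workload_args],
        # 3. Merge the raw profiles into a single indexed profile.
        ["llvm-profdata", "merge", f"-output={merged}", profile_dir],
        # 4. Rebuild the binary using the collected profile.
        ["clang++", "-O2", f"-fprofile-use={merged}", source, "-o", binary],
    ]


if __name__ == "__main__":
    for step in pgo_commands("bench.cpp", "bench", "pgo-data", ["--iterations", "100"]):
        print(" ".join(step))
```

The quality of the result depends on how representative the workload in step 2 is, which is why a build option that lets users profile their own workloads is useful.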

Additional context

As an additional optimization step after PGO, I can suggest Post-Link Optimization (PLO) with a tool like LLVM BOLT. However, I think it is worth evaluating only after PGO has been integrated into TensorRT.

Here are several PGO-related links I have collected (more PGO-related materials are available at https://github.com/zamazan4ik/awesome-pgo/).
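
A typical sampling-based BOLT run can be sketched in the same way. This is a rough outline under several assumptions: Linux `perf` is available, the CPU supports branch sampling (needed for `-j any,u`), and the binary was linked with `--emit-relocs` so that BOLT can reorder functions. The binary name and workload arguments are hypothetical, and the script only assembles the commands without running them.

```python
from typing import List


def bolt_commands(binary: str, workload_args: List[str]) -> List[List[str]]:
    """Return the three steps of a sampling-based LLVM BOLT optimization.

    The binary name and arguments are hypothetical; nothing is executed.
    """
    perf_data = "perf.data"
    fdata = "perf.fdata"
    return [
        # 1. Sample the running binary with perf, recording branch data.
        ["perf", "record", "-e", "cycles:u", "-j", "any,u", "-o", perf_data,
         "--", f"./{binary}", *workload_args],
        # 2. Convert the perf samples into BOLT's profile format.
        ["perf2bolt", "-p", perf_data, "-o", fdata, f"./{binary}"],
        # 3. Rewrite the binary with an optimized code layout.
        ["llvm-bolt", f"./{binary}", "-o", f"./{binary}.bolt", f"-data={fdata}",
         "-reorder-blocks=ext-tsp", "-reorder-functions=hfsort"],
    ]


if __name__ == "__main__":
    for step in bolt_commands("bench", ["--iterations", "100"]):
        print(" ".join(step))
```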

Examples of how PGO optimization is integrated into other projects:

I have some examples of how PGO information looks in the documentation:

Regarding LLVM BOLT integration, I have the following examples:

zamazan4ik added the "feature request" label on Dec 2, 2023
@narendasan
Collaborator

Do you think this is more geared towards TensorRT itself or the PyTorch extension? This might be more relevant to open in https://github.com/nvidia/pytorch

@zamazan4ik
Author

zamazan4ik commented Dec 2, 2023

> This might be more relevant to open in https://github.com/nvidia/pytorch

For this page, I get HTTP 404. Does it have special access requirements, or is the link just wrong?

@narendasan
Collaborator

Sorry, wrong URL: https://github.com/nvidia/tensorrt
