❓ [Question] Profiling examples? #1467
For TorchScript, it's a matter of finding all attributes that are of the TensorRT engine type. For example, in TorchScript, if you look at the graph for a module instance called `trt_mod`, you can find the engine attribute and enable profiling on it.
The current standard FX TRTModule will print out layer timings when you enable profiling; the experimental runtime and TorchScript will save the layer timings to a JSON file you can visualize with Perfetto, as well as print them out.
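A minimal sketch of what that looks like on the FX side, assuming `trt_mod` is a `torch_tensorrt.fx.TRTModule` and that `enable_profiling`/`disable_profiling` are the relevant method names:

```python
import torch

# Assumed: trt_mod is a torch_tensorrt.fx.TRTModule from the FX lowering path,
# and enable_profiling/disable_profiling toggle a TensorRT profiler on its
# execution context (method names are an assumption, not confirmed above).
trt_mod.enable_profiling()
out = trt_mod(torch.randn(1, 3, 224, 224).cuda())  # layer timings reported during this call
trt_mod.disable_profiling()
```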
Thank you @narendasan, this is very helpful.
Can you point me to something that would show how to do this for the TorchScript front end? For my INT8 models I'm using PTQ like:
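(Roughly the documented `DataLoaderCalibrator` flow; the dataloader, input shape, and cache path below are placeholders rather than my exact setup.)

```python
import torch
import torch_tensorrt

# Placeholder calibration setup: calib_dataloader, the input shape, and the
# cache path stand in for the real values.
calibrator = torch_tensorrt.ptq.DataLoaderCalibrator(
    calib_dataloader,
    cache_file="./calibration.cache",
    use_cache=False,
    algo_type=torch_tensorrt.ptq.CalibrationAlgo.ENTROPY_CALIBRATION_2,
    device=torch.device("cuda:0"),
)

trt_mod = torch_tensorrt.compile(
    scripted_model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224))],
    enabled_precisions={torch.int8},
    calibrator=calibrator,
)
```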
I would love to profile this without having to figure out the engine attribute lookup by hand. Also, is there a way to return a dictionary with the timings rather than dumping them to a file? Or if not, is there a way to specify/get the JSON filepath they're being dumped to? I don't see any JSON file being output.
For profiling with the TS front end, I see how to do it now. Is there a way to use PTQ with FX via the new unified front end, so I could stick to FX for everything (including profiling)? The FX profiler is great and easy to customize! I just need to be able to do everything that I could do with the TorchScript PTQ path.
Right now there isn't a way to tell the runtime to enable profiling at compile time, since they are mostly completely separate. For the first version of this, unfortunately the "hacky" way of looking for attributes is what we have. Could totally see us adding a context manager where you could do something like:

```python
with torch_tensorrt.execution_profiling_enabled(module=trt_mod, profile_path="x"):
    output = trt_mod(input)
```

where the manager would go through the module for you and enable profiling.
Right now, it's only the JSON trace format that is supported (engine layer execution timings can also be dumped to the console with the appropriate logging level, INFO). The location of these traces can be set as a field of the module.
TRTModuleNext will be the class of module returned to you if you use the experimental runtime. Don't think FX exposes a place for your calibrator right now, but it is something that is easy to hack in (just haven't had time to do it properly yet / not sure how it would work for split graphs). But the DataLoaderCalibrator shipped in torch_tensorrt.ptq is fully compatible with the TRT Python API, so you should be able to just set it as part of the builder config in TRTInterpreter, as sketched below.
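A minimal sketch of that idea at the raw TensorRT level; it assumes `DataLoaderCalibrator` implements TensorRT's calibrator interface (which is what "compatible with the TRT Python API" implies), and the names are illustrative:

```python
import tensorrt as trt
import torch
import torch_tensorrt

# The calibrator from torch_tensorrt.ptq speaks TensorRT's calibrator
# interface, so it can be attached to a plain TensorRT builder config,
# which is the same hook TRTInterpreter's builder would need to set.
calibrator = torch_tensorrt.ptq.DataLoaderCalibrator(
    calib_dataloader,                  # placeholder: DataLoader of calibration batches
    cache_file="./calibration.cache",
    use_cache=False,
    algo_type=torch_tensorrt.ptq.CalibrationAlgo.ENTROPY_CALIBRATION_2,
    device=torch.device("cuda:0"),
)

builder = trt.Builder(trt.Logger(trt.Logger.INFO))
config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)
config.int8_calibrator = calibrator
```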
That would be awesome. For now I can make the "hacky" way work.
Sounds good.
Got it, I'll try this out and see how my results differ for FX / experimental FX / TS, and open a separate question for this if I'm confused or the results differ significantly. As of right now I think #1452 will cause FX, and maybe the experimental FX, to give worse results.
Okay, I dug into the code a bit more and I get it now. Once #1452 is resolved I'll see if I can make this work. I really appreciate all the help here, thank you so much. If I don't have any more profiling-related questions in the next week or so I'll close the issue.
This issue has not seen activity for 90 days. Remove the stale label or comment, or this will be closed in 10 days.
❓ Question
When I'm not using TensorRT, I run my model through an FX interpreter that times each call op (by inserting CUDA events before/after and measuring the elapsed time). I'd like to do something similar after converting/compiling the model to TensorRT, and I see there is some profiling built in with tensorrt.Profiler, but its usage isn't clear to me.
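Roughly, that FX-side timing looks like this (a minimal sketch of the approach, not the exact code):

```python
import torch
from torch.fx import GraphModule, Interpreter

class TimedInterpreter(Interpreter):
    """Times each call_* node with CUDA events, as described above."""

    def __init__(self, gm: GraphModule):
        super().__init__(gm)
        self.times_ms = {}

    def run_node(self, n):
        if n.op in ("call_function", "call_method", "call_module"):
            start = torch.cuda.Event(enable_timing=True)
            end = torch.cuda.Event(enable_timing=True)
            start.record()
            result = super().run_node(n)
            end.record()
            torch.cuda.synchronize()  # ensure both events have completed
            self.times_ms[n.name] = start.elapsed_time(end)
            return result
        return super().run_node(n)
```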
Is there an example anywhere of how to time each layer or op with this profiler, or any other means of profiling the TensorRT engine/layers? I don't mind messing with the op converters to do so, but I don't want to have to wrap every op converter my model uses. More generally, I think I could use the PyTorch profiler, but it would be difficult to parse the output to get clear per-layer/per-op results.
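From the TensorRT docs, my understanding of the raw profiler hook is something like this (a sketch based on my reading of the API, not tested end to end):

```python
import tensorrt as trt

class PerLayerProfiler(trt.IProfiler):
    """Collects the per-layer times TensorRT reports during execution."""

    def __init__(self):
        super().__init__()
        self.layer_times_ms = {}

    def report_layer_time(self, layer_name, ms):
        # TensorRT calls this once per layer for each profiled inference.
        self.layer_times_ms[layer_name] = self.layer_times_ms.get(layer_name, 0.0) + ms

# Usage sketch: attach to an execution context, then run inference as usual.
# context = engine.create_execution_context()
# context.profiler = PerLayerProfiler()
```

What I can't see is how to get at the engine/execution context once it's wrapped by Torch-TensorRT, which is what this question is about.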