Is your feature request related to a problem? Please describe.
Currently the fastest way to run Computer Vision inference is a TensorRT-optimised model. TensorRT is readily available from C/C++, but there is no practical way to use it from C#.
Describe the solution you'd like
I would like to be able to load a TensorRT engine into the C# process's memory and run inference on it directly, passing data in and out as OpenCVSharp Mat structures.
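For illustration, a minimal sketch of the API surface I have in mind is below. All of these names (`TensorRt`, `ITensorRtEngine`, `Infer`, etc.) are hypothetical; nothing here exists today, this is just the shape of the binding being requested.

```csharp
using System;
using OpenCvSharp;

// Hypothetical API surface for the requested binding (names are placeholders):
public interface ITensorRtEngine : IDisposable
{
    ITensorRtContext CreateExecutionContext();
}

public interface ITensorRtContext : IDisposable
{
    // Would copy `input` to device memory, execute the engine,
    // and write the result into `output` -- all in-process.
    void Infer(Mat input, Mat output);
}

public static class TensorRt
{
    // Would deserialise a pre-built .engine file via the TensorRT runtime.
    public static ITensorRtEngine Load(string enginePath) =>
        throw new NotImplementedException("This is the feature being requested.");
}

// Intended call site:
// using var engine = TensorRt.Load("model.engine");
// using var ctx = engine.CreateExecutionContext();
// ctx.Infer(inputMat, outputMat);   // no serialisation, no IPC
```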
Describe alternatives you've considered
We are currently using Triton Inference Server, but it adds latency on every request for data serialisation and network transmission.
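For context, our current Triton path looks roughly like the sketch below, using client stubs generated from Triton's grpc_service.proto (the exact namespace and class names depend on the code-generation settings, and the model/tensor names are placeholders). Every frame is copied out of the Mat, serialised into a protobuf message, and sent over the network; that per-request copying and round-trip is the overhead described above.

```csharp
using System.Runtime.InteropServices;
using Google.Protobuf;
using Grpc.Net.Client;
using Inference;        // default namespace of stubs generated from grpc_service.proto
using OpenCvSharp;

var channel = GrpcChannel.ForAddress("http://localhost:8001");
var client = new GRPCInferenceService.GRPCInferenceServiceClient(channel);

using Mat frame = Cv2.ImRead("frame.png");   // CV_8UC3, HWC layout

// Copy #1: native Mat buffer -> managed byte array.
int byteCount = (int)(frame.Total() * frame.ElemSize());
var pixels = new byte[byteCount];
Marshal.Copy(frame.Data, pixels, 0, byteCount);

// Copy #2: managed array -> protobuf message ("model" and "images" stand in
// for our deployment's actual model and input tensor names).
var request = new ModelInferRequest { ModelName = "model" };
request.Inputs.Add(new ModelInferRequest.Types.InferInputTensor
{
    Name = "images",
    Datatype = "UINT8",
    Shape = { 1, frame.Height, frame.Width, 3 },
});
request.RawInputContents.Add(ByteString.CopyFrom(pixels));

// Network round-trip to the Triton server, on top of the copies above.
var response = await client.ModelInferAsync(request);
```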
Additional context
There are certain scenarios, such as Quality Control, that would benefit greatly from calling a TensorRT model in-process.