Case:
ffmpeg -vsync 0 -hwaccel cuda -hwaccel_output_format cuda -hwaccel_device 0 -c:v h264_cuvid -i input.h264 -c:v h264_nvenc output.h264
Q:
Both decoding and encoding are hardware-accelerated on the GPU: decoding completes on the GPU and encoding also completes on the GPU, with no memory copy from GPU to host memory.
In this case, how does BMF's handling of the decoding process differ from CPU-mode decoding? Do both use vaFrame to receive the decoded results, with the address in vaFrame pointing to GPU memory in one case and to CPU memory in the other?
Also, in this scenario, does BMF copy the GPU data back to host memory, encapsulate it as a TASK, and place it in the scheduling queue? Wouldn't that defeat the original purpose of reducing memory copies?
In the BMF framework, we support data flow on a GPU-only link. If your entire pipeline runs on the GPU, the data is not copied to the CPU. The demo master/bmf/demo/gpu_module is an example of using the BMF framework to perform image processing on a GPU link. In my opinion, in addition to supporting data flow on the GPU link, BMF has the following advantages (see the sketch after this list):
BMF supports convenient data transfer between the CPU and GPU.
BMF provides a convenient backend interface for data transfer between common image processing frameworks such as bmf, torch, and opencv.
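For the hardware-accelerated case above, a GPU-only pipeline can be expressed with BMF's Python graph API. Below is a minimal sketch, assuming the built-in FFmpeg-based decoder and encoder modules accept `hwaccel` and `pix_fmt` entries in `video_params` as in the gpu_module demo; the file names are placeholders.

```python
import bmf

# Build a graph in which decoded frames stay in GPU memory and are
# fed directly to the NVENC encoder, so no GPU-to-host copy is needed.
graph = bmf.graph()

video = graph.decode({
    "input_path": "input.h264",
    "video_params": {
        "hwaccel": "cuda"          # decode with NVDEC, keep frames on the GPU
    }
})

bmf.encode(
    video["video"],
    None,                          # no audio stream in this raw H.264 case
    {
        "output_path": "output.h264",
        "video_params": {
            "codec": "h264_nvenc", # encode on the GPU with NVENC
            "pix_fmt": "cuda"      # consume CUDA frames without downloading
        }
    }
).run()
```

In this setup the frames carried between the decoder and encoder reference GPU memory, and any GPU module inserted between them (as in the gpu_module demo) operates on that memory directly.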