Multiple GPU #63

Open
guods opened this issue Aug 27, 2019 · 8 comments

Comments

@guods

guods commented Aug 27, 2019

Thank you for your work. I have some questions:
How does the generated engine run on a machine with multiple graphics cards? How do I set the GPU ID?

@lewes6369
Owner

Generating the engine should not be the bottleneck. You only need to build and save the engine the first time; after that you can load it from the engine file.

@guods
Author

guods commented Sep 9, 2019

Generating the engine is not the bottleneck. After creating the engine file, I want to run it on a specific GPU, so I set the GPU ID with "cudaSetDevice", but it did not work.

@guods
Author

guods commented Sep 9, 2019

Have you ever tried this experiment: for graphics cards with the same architecture, can an engine file generated on a lower-end card (1060) be used on a higher-end card (1080)?

@zerollzeng

Hi @guods, it's nice to see you again :)
For your first question:
Each ICudaEngine object is bound to a specific GPU when it is instantiated, either by the builder or on deserialization. To select the GPU, use cudaSetDevice() before calling the builder or deserializing the engine. Each IExecutionContext is bound to the same GPU as the engine from which it was created. When calling execute() or enqueue(), ensure that the thread is associated with the correct device by calling cudaSetDevice() if necessary.
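
A minimal sketch of the ordering described above, written without the CUDA or TensorRT headers so that it stays self-contained. `selectDevice` and `deserialize` are hypothetical hooks, not real API: in an actual program the first would wrap cudaSetDevice(deviceId) and compare the result with cudaSuccess, and the second would wrap the engine deserialization call. The sketch only shows that the device is selected first, on the same thread, and that a failed selection stops the load instead of continuing on whichever device happens to be current.

```cpp
#include <functional>
#include <string>

// Hypothetical hooks (assumptions, not real CUDA/TensorRT API):
//   SelectDevice: would wrap cudaSetDevice(id) and return true on cudaSuccess.
//   Deserialize:  would wrap engine deserialization and return true on success.
using SelectDevice = std::function<bool(int)>;
using Deserialize  = std::function<bool(const std::string&)>;

enum class LoadResult { Ok, DeviceSelectFailed, DeserializeFailed };

// Bind the calling thread to `deviceId` first, then load the engine file.
// If device selection fails, stop: the engine is never deserialized.
LoadResult loadEngineOnDevice(int deviceId,
                              const std::string& enginePath,
                              const SelectDevice& selectDevice,
                              const Deserialize& deserialize) {
    if (!selectDevice(deviceId)) {
        return LoadResult::DeviceSelectFailed;
    }
    if (!deserialize(enginePath)) {
        return LoadResult::DeserializeFailed;
    }
    return LoadResult::Ok;
}
```

Checking the result of the selection step matters here: if the device ID is invalid and the error is ignored, the build or load can still succeed on the default device, which makes it look as if the call had no effect.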

And for the second question:
I recommend that you don’t. However, if you do, you’ll need to follow these guidelines:
The major, minor, and patch versions of TensorRT must match between systems. This ensures you are picking kernels that are still present and have not undergone certain optimizations or bug fixes that would change their behavior.
The CUDA compute capability major and minor versions must match between systems. This ensures that the same hardware features are present so the kernel will not fail to execute. An example would be mixing cards with different precision capabilities.
The following properties should match between systems:
Maximum GPU graphics clock speed
Maximum GPU memory clock speed
GPU memory bus width
Total GPU memory
GPU L2 cache size
SM processor count
Asynchronous engine count
If any of the above properties do not match, you will receive the following warning: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
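
The property list above can be sketched as a simple pre-flight check. This is only an illustration under stated assumptions: `GpuInfo` and `planMismatches` are hypothetical names, and the struct would have to be filled in by the caller (for example from the CUDA device properties); TensorRT performs its own check and emits the warning quoted above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical summary of one GPU; the fields mirror the list above.
struct GpuInfo {
    int ccMajor;                   // compute capability, major
    int ccMinor;                   // compute capability, minor
    int maxGraphicsClockKHz;       // maximum GPU graphics clock speed
    int maxMemoryClockKHz;         // maximum GPU memory clock speed
    int memoryBusWidthBits;        // GPU memory bus width
    std::size_t totalMemoryBytes;  // total GPU memory
    int l2CacheBytes;              // GPU L2 cache size
    int smCount;                   // SM processor count
    int asyncEngineCount;          // asynchronous engine count
};

// Return the names of the properties that differ between the GPU the
// engine was built on and the GPU it is about to run on.
std::vector<std::string> planMismatches(const GpuInfo& built, const GpuInfo& target) {
    std::vector<std::string> diff;
    if (built.ccMajor != target.ccMajor || built.ccMinor != target.ccMinor)
        diff.push_back("CUDA compute capability");
    if (built.maxGraphicsClockKHz != target.maxGraphicsClockKHz)
        diff.push_back("Maximum GPU graphics clock speed");
    if (built.maxMemoryClockKHz != target.maxMemoryClockKHz)
        diff.push_back("Maximum GPU memory clock speed");
    if (built.memoryBusWidthBits != target.memoryBusWidthBits)
        diff.push_back("GPU memory bus width");
    if (built.totalMemoryBytes != target.totalMemoryBytes)
        diff.push_back("Total GPU memory");
    if (built.l2CacheBytes != target.l2CacheBytes)
        diff.push_back("GPU L2 cache size");
    if (built.smCount != target.smCount)
        diff.push_back("SM processor count");
    if (built.asyncEngineCount != target.asyncEngineCount)
        diff.push_back("Asynchronous engine count");
    return diff;
}
```

An empty result means the two cards match on every listed property. Two cards of the same architecture can share a compute capability and still differ in SM count or memory size, which is the case the warning is about.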

@guods
Author

guods commented Sep 10, 2019

Thanks for your reply. I also read that passage in the TensorRT documentation. However, even though cudaSetDevice() (called before creating the engine) returned an error, the engine was still created and gave the correct result, which suggests that cudaSetDevice() has no effect.

@zerollzeng

It should not return an error. Hmm... what kind of error did you get?

@guods
Author

guods commented Sep 10, 2019

I caused the error deliberately. I wanted to know whether the engine file would still be generated properly even if I set the device incorrectly. I set it incorrectly and the file was still generated properly. So for creating the engine, cudaSetDevice seems to have no effect.

@lewes6369
Owner

I am not sure I understand your issue. As @zerollzeng said, an engine is not portable across cards with different architectures. Maybe just try setting CUDA_VISIBLE_DEVICES to the index of the graphics card on which you want to create and deploy the engine.
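
A minimal sketch of the CUDA_VISIBLE_DEVICES suggestion, assuming a POSIX system (it relies on setenv). The helper name `exposeOnlyGpu` is hypothetical. The variable has to be set before the first CUDA call in the process, because the CUDA runtime reads it when it initializes; the selected card then appears to the process as device 0.

```cpp
#include <stdlib.h>  // setenv (POSIX)
#include <string>

// Restrict this process to one physical GPU by setting CUDA_VISIBLE_DEVICES.
// Call this before any CUDA or TensorRT function; returns true on success.
bool exposeOnlyGpu(int physicalIndex) {
    if (physicalIndex < 0) {
        return false;  // negative indices are rejected by this sketch
    }
    const std::string value = std::to_string(physicalIndex);
    return setenv("CUDA_VISIBLE_DEVICES", value.c_str(), /*overwrite=*/1) == 0;
}
```

The same effect is usually achieved from the shell by prefixing the launch command with the variable, for example `CUDA_VISIBLE_DEVICES=1` followed by the program name, so that no code change is needed.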


3 participants