Option --gpus fails with AMD and Intel GPUs #2063
Comments
@mviereck thanks, it was always meant to be vendor-neutral. If you want to contribute to the moby/moby repo to add support for other GPU vendors, you are more than welcome! We would need expertise from the various GPU vendors which I don't have. @RenaudWasTaken was providing NVIDIA expertise hence the first implementation. |
Thanks! I am not experienced with Go, though, so I cannot contribute directly with code. This works quite well with MESA drivers on host and in image, even if host and container system are quite different (e.g. debian host and alpine image). Given MESA is installed, all you need is described above:
However, this does not cover additional setups like limited GPU memory access. That would indeed need expertise from the vendors. But the probably most common These packages cover all MESA packages needed in their dependencies:
|
You may have a look at this article: The AMD Deep Learning Stack Using Docker by Sam Tehrani |
is there any progress with it? |
I don't think contributions have been made to provide support for other GPU vendors, so no change |
Checking in, any progress on this? Thank you. |
No, no progress (that I'm aware of) |
It doesn't need to support other vendors, but it should not be crashing like this:
$ docker run --gpus all -it ubuntu:20.04
Unable to find image 'ubuntu:20.04' locally
20.04: Pulling from library/ubuntu
Digest: sha256:fd92c36d3cb9b1d027c4d2a72c6bf0125da82425fc2ca37c414d4f010180dc19
Status: Downloaded newer image for ubuntu:20.04
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
ERRO[0003] error waiting for container: context canceled
Also, look at this error message. It doesn't say Nvidia GPU not detected. It says |
Facing the same error, using an eGPU that is turned on and off when needed. It would be highly desirable if the container would start without the eGPU being present. |
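One possible workaround for the case above is to pass --gpus all only when an NVIDIA device is actually present. The following is a minimal sketch, not something proposed in this thread. It assumes that the existence of /dev/nvidiactl is a usable signal that the NVIDIA driver is loaded. The function name optional_gpu_flag and its optional path argument are hypothetical. It does not check whether NVIDIA's container runtime is installed.

```shell
#!/bin/sh
# Print "--gpus all" only when an NVIDIA control device node is present.
# The default path /dev/nvidiactl is an assumption about the host driver setup;
# the optional argument exists so the check can be pointed at another path.
optional_gpu_flag() {
    node="${1:-/dev/nvidiactl}"
    if [ -e "$node" ]; then
        printf '%s\n' "--gpus all"
    fi
}

# Usage (left unquoted on purpose so the two words become separate arguments):
#   docker run $(optional_gpu_flag) -it ubuntu:20.04
```

With the eGPU switched off, the function prints nothing and the container starts without the --gpus option.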
And this is still an issue smh |
any update? Docker is a massive project. |
@slmlm2009 Docker is used by a lot of people, but so far no one has contributed anything towards making this work. Another thing to note: docker v25 will ship with CDI support which may end up being the best place to handle gpus all around. |
So, only Nvidia GPUs work with Docker? |
docker ships with support for CDI now which allows for anyone to add support for any type of device via the I don't think we'll be extending the gpus flag beyond the current scope unless someone really wants to add support (which to date no one has submitted anything). Even if they really wanted to, I think I'd probably point them to integrate with CDI. Closing this for now. |
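To illustrate what a CDI-based setup could look like, here is a minimal sketch that writes a CDI spec for a single DRM device node. The kind example.com/gpu, the device name card0, the function name write_cdi_spec, and the default node /dev/dri/card0 are all hypothetical and chosen for illustration only. It also assumes the daemon has CDI enabled and reads specs from /etc/cdi.

```shell
#!/bin/sh
# Write a minimal CDI spec that exposes one device node.
# "example.com/gpu" and "card0" are hypothetical names, not vendor-defined ones.
# $1: output file for the spec; $2: device node (default is an assumption).
write_cdi_spec() {
    out="$1"
    node="${2:-/dev/dri/card0}"
    cat > "$out" <<EOF
cdiVersion: "0.6.0"
kind: "example.com/gpu"
devices:
  - name: "card0"
    containerEdits:
      deviceNodes:
        - path: "$node"
EOF
}

# Usage, assuming CDI is enabled in the daemon and /etc/cdi is a spec directory:
#   write_cdi_spec /etc/cdi/example-gpu.yaml
#   docker run --device example.com/gpu=card0 -it ubuntu:20.04
```

The device is then requested by its fully qualified CDI name through --device rather than through --gpus.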
CDI is not the solution (yet):
As long as it's experimental, it's not a proper solution. |
Description
The new option --gpus to provide the GPU to containers fails with AMD and Intel GPUs. It is specific to systems with an NVIDIA GPU, NVIDIA's proprietary driver, and NVIDIA's container runtime setup.
A cli option should be general and not be vendor-specific.
Steps to reproduce the issue:
On a system with an AMD GPU:
Describe the results you received:
Option --gpus all fails on a system with an AMD GPU. Likely it also fails with Intel GPUs and with NVIDIA GPUs using the nouveau driver.
Describe the results you expected:
Option --gpus all should provide the GPU to the container.
Output of docker --version:
Docker version 19.03.1, build 74b1e89
Discussion
Coming from #1200, #1714, opening a new ticket:
I want to analyze the current state of --gpus and make proposals:
- docker plugin install and drop the cli option --gpus.
- --gpus work in general for all vendors. I would prefer that.
The current state:
- --gpus works with a specific NVIDIA setup only. Dependencies:
- --gpus fails with NVIDIA GPUs not fulfilling above dependencies.
- --gpus fails with AMD GPUs.
- --gpus fails with Intel GPUs.
- --gpus fails with NVIDIA GPUs and nouveau.
- --gpus work with the combination:
Desirable:
- nouveau and possible existing NVIDIA driver in container.
ToDo:
- --gpus should provide /dev/dri to container.
- --gpus should provide /dev/nvidia* to container.
- --gpus maybe should provide /dev/vga_arbiter to container.
- --gpus maybe should add the container user to groups video and render to support unprivileged users in container.
At least for --gpus all or e.g. --gpus intel this should not be too hard.
Maybe additional, or to be done by the user:
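The ToDo items in the issue can be approximated by hand with the existing --device and --group-add options of docker run. The following is a rough sketch of that manual equivalent, not the behaviour of --gpus. The function name manual_gpu_args and its two optional parameters (device root and group file) are hypothetical. The sketch passes numeric group IDs looked up on the host, on the assumption that the device nodes are owned by the host's video and render groups.

```shell
#!/bin/sh
# Print docker run flags that hand GPU device nodes to a container manually.
# $1: device root (default /dev), $2: group file (default /etc/group).
# Both parameters are hypothetical and exist mainly to make the sketch testable.
manual_gpu_args() {
    dev_root="${1:-/dev}"
    group_file="${2:-/etc/group}"
    args=""
    # Device nodes named in the ToDo list; nodes missing on this host are skipped.
    for node in "$dev_root/dri" "$dev_root/vga_arbiter" "$dev_root"/nvidia*; do
        if [ -e "$node" ]; then
            args="$args --device $node"
        fi
    done
    # Add the host GIDs of video and render so unprivileged container users
    # can open the nodes. Groups absent from the group file are skipped.
    for grp in video render; do
        gid=$(awk -F: -v g="$grp" '$1 == g { print $3; exit }' "$group_file" 2>/dev/null)
        if [ -n "$gid" ]; then
            args="$args --group-add $gid"
        fi
    done
    printf '%s\n' "${args# }"
}

# Usage (unquoted so the flags split into separate arguments):
#   docker run $(manual_gpu_args) -it ubuntu:20.04
```

Passing the device nodes alone is not sufficient for NVIDIA's proprietary driver, which also needs matching user-space libraries inside the container. For Mesa-based setups the image still needs the Mesa packages installed.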