
Option --gpus fails with AMD and Intel GPUs #2063

Closed
mviereck opened this issue Aug 23, 2019 · 15 comments

@mviereck

Description
The new option --gpus, which provides GPUs to containers, fails with AMD and Intel GPUs.

It only works on systems with an NVIDIA GPU, NVIDIA's proprietary driver, and NVIDIA's container runtime setup.

A CLI option should be general, not vendor-specific.

Steps to reproduce the issue:
On a system with an AMD GPU:

$ docker run --rm --gpus all debian echo
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

Describe the results you received:
Option --gpus all fails on a system with an AMD GPU.
Likely it also fails with Intel GPUs and with NVIDIA GPUs using the nouveau driver.

Describe the results you expected:
Option --gpus all should provide the GPU to the container.

Output of docker --version:
Docker version 19.03.1, build 74b1e89

Discussion
Coming from #1200 and #1714, I am opening a new ticket.
I want to analyze the current state of --gpus and make proposals:

  • Either make GPU support vendor-specific to NVIDIA with docker plugin install and drop the cli option --gpus.
  • Or make --gpus work in general for all vendors. I would prefer that.

The current state:

  • --gpus works with a specific NVIDIA setup only. Dependencies:
    • NVIDIA GPU
    • NVIDIA proprietary driver on host
    • nvidia-container-toolkit on host
    • nvidia/nvidia-docker image.
  • --gpus fails with NVIDIA GPUs not fulfilling above dependencies.
  • --gpus fails with AMD GPUs.
  • --gpus fails with Intel GPUs.
  • --gpus fails with NVIDIA GPUs and nouveau.
  • Unknown/to check: Does --gpus work with the combination:
    • NVIDIA GPU
    • NVIDIA proprietary driver on host
    • same NVIDIA proprietary driver in container (arbitrary image)
    • Without nvidia-container-toolkit on host

Desirable:

  • Support of other vendors and nouveau and possible existing NVIDIA driver in container. ToDo:
    • --gpus should provide /dev/dri to container.
    • --gpus should provide /dev/nvidia* to container.
    • --gpus maybe should provide /dev/vga_arbiter to container.
    • --gpus maybe should add the container user to groups video and render to support unprivileged users in container.

At least for --gpus all, or e.g. --gpus intel, this should not be too hard.
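As a sketch of what a vendor-neutral --gpus all could amount to on a Mesa/DRI system, the ToDo items above roughly correspond to flags that docker run already supports today. The device path and group names below are the usual Linux defaults and may differ per system; --group-add by name only works if the group exists in the image (otherwise a numeric GID is needed):

```shell
# Manual equivalent of a vendor-neutral "--gpus all" on a Mesa/DRI system:
# share the DRI card/render nodes and grant an unprivileged container user
# access via the video and render groups.
docker run --rm \
  --device /dev/dri \
  --group-add video \
  --group-add render \
  debian ls -l /dev/dri
```

If the groups are missing in the image, passing the host's GIDs directly (e.g. `--group-add "$(getent group render | cut -d: -f3)"`) is a common workaround.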

Maybe additional, or to be done by the user:

  • Driver check on host and in container by docker.
  • Driver installation in container by docker. (Probably going too far. However, possible e.g. with NVIDIA's proprietary driver. I am not sure how to accomplish this with MESA drivers, but likely possible, too.)
@tiborvass
Collaborator

@mviereck thanks, it was always meant to be vendor-neutral. If you want to contribute to the moby/moby repo to add support for other GPU vendors, you are more than welcome! We would need expertise from the various GPU vendors which I don't have. @RenaudWasTaken was providing NVIDIA expertise hence the first implementation.

@mviereck
Author

mviereck commented Aug 23, 2019

If you want to contribute to the moby/moby repo to add support for other GPU vendors, you are more than welcome! We would need expertise from the various GPU vendors which I don't have.

Thanks! I am not experienced with Go, though, so I cannot contribute directly with code.
However, I have some experience in providing GPUs of all vendors to docker containers.
My project x11docker allows to run GUI applications in docker containers, optionally with GPU hardware acceleration.

This works quite well with MESA drivers on host and in image, even if host and container system are quite different (e.g. debian host and alpine image). Given MESA is installed, all you need is described above:

  • --gpus should provide /dev/dri to container.
  • --gpus should provide /dev/nvidia* to container.
  • --gpus maybe should provide /dev/vga_arbiter to container.
  • --gpus maybe should add the container user to groups video and render to support unprivileged users in container.

However, this does not cover additional setups like limited GPU memory access. That would indeed need expertise from the vendors. But probably the most common case, --gpus all, would be covered.

These packages pull in all needed MESA packages through their dependencies:

|                        | Debian and Ubuntu              | Arch                                              | Fedora                                                   | Alpine                                                                       |
|------------------------|--------------------------------|---------------------------------------------------|----------------------------------------------------------|------------------------------------------------------------------------------|
| OpenGL                 | mesa-utils, mesa-utils-extra   | mesa-demos                                        | glx-utils, mesa-dri-drivers                              | mesa-demos, mesa-dri-ati, mesa-dri-intel, mesa-dri-nouveau, mesa-dri-swrast  |
| Video decoding support | libxv1, va-driver-all          | libxv, libva, libva-intel-driver, libva-vdpau-driver | libXv, libva, libva-intel-hybrid-driver, libva-vdpau-driver | libxv, libva, libva-glx, libva-intel-driver, libva-vdpau-driver              |

@Alex031544

You may have a look at this article: The AMD Deep Learning Stack Using Docker by Sam Tehrani

@dimagoltsman

Is there any progress with this?

@thaJeztah
Member

I don't think any contributions have been made to provide support for other GPU vendors, so no change.

@MithunKinarullathil

MithunKinarullathil commented Mar 6, 2022

Checking in, any progress on this? Thank you.

@thaJeztah
Member

No, no progress (that I'm aware of)

@w8jcik

w8jcik commented Jul 25, 2022

It doesn't need to support other vendors, but it should not be crashing like this:

$ docker run --gpus all -it ubuntu:20.04
Unable to find image 'ubuntu:20.04' locally
20.04: Pulling from library/ubuntu
Digest: sha256:fd92c36d3cb9b1d027c4d2a72c6bf0125da82425fc2ca37c414d4f010180dc19
Status: Downloaded newer image for ubuntu:20.04
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
ERRO[0003] error waiting for container: context canceled

Also, look at this error message. It doesn't say that an NVIDIA GPU was not detected; it says ERRO[0003] error waiting for container: context canceled. This is not user-friendly.

@michikite

I am facing the same error, using an eGPU that is turned on and off as needed. It would be highly desirable for the container to start even without the eGPU being present.

@risharde
Copy link

And this is still an issue smh

@slmlm2009

Any update?

Docker is a massive project. Basic hardware support is required.

@cpuguy83
Collaborator

@slmlm2009 Docker is used by a lot of people, but so far no one has contributed anything towards making this work.
Note that NVIDIA is the one that did the work to make --gpus work with NVIDIA hardware.

Another thing to note: docker v25 will ship with CDI support which may end up being the best place to handle gpus all around.

@Daasin

Daasin commented Jun 5, 2024

So, only NVIDIA GPUs work with Docker?

@cpuguy83
Collaborator

cpuguy83 commented Jun 5, 2024

Docker ships with support for CDI now, which allows anyone to add support for any type of device via the --device flag.
https://docs.docker.com/reference/cli/dockerd/#enable-cdi-devices

I don't think we'll be extending the gpus flag beyond the current scope unless someone really wants to add support (which to date no one has submitted anything). Even if they really wanted to, I think I'd probably point them to integrate with CDI.

Closing this for now.
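For readers landing here via search: the CDI route mentioned above works by placing a vendor spec file under /etc/cdi (or /var/run/cdi). The following is an illustrative sketch only; the kind name amd.com/gpu and the device node paths are assumptions for the example, not an official vendor spec:

```json
{
  "cdiVersion": "0.5.0",
  "kind": "amd.com/gpu",
  "devices": [
    {
      "name": "all",
      "containerEdits": {
        "deviceNodes": [
          { "path": "/dev/dri/card0" },
          { "path": "/dev/dri/renderD128" }
        ]
      }
    }
  ]
}
```

With CDI enabled in the daemon, such a device would then be requested with docker run --device amd.com/gpu=all.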

@cpuguy83 cpuguy83 closed this as completed Jun 5, 2024
@simaotwx

simaotwx commented Sep 9, 2024

docker ships with support for CDI now which allows for anyone to add support for any type of device via the --device flag. https://docs.docker.com/reference/cli/dockerd/#enable-cdi-devices

[...] I think I'd probably point them to integrate with CDI.

CDI is not the solution (yet):

This is an experimental feature and as such doesn't represent a stable API.

This feature isn't enabled by default. To enable this feature, set features.cdi to true in the daemon.json configuration file.

As long as it's experimental, it's not a proper solution.
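Per the documentation quoted above, enabling the experimental CDI support means adding this to /etc/docker/daemon.json and restarting the daemon:

```json
{
  "features": {
    "cdi": true
  }
}
```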
