GPU isolation options

We want to make sure that a workload cannot gain access to more AMD GPUs than it requested by setting certain environment variables (e.g. `HIP_VISIBLE_DEVICES` / `ROCR_VISIBLE_DEVICES`).
I am not sure whether this is an issue today; we cannot verify it because we do not currently have a box with more than one AMD GPU.

For context, on NVIDIA it is possible to expose all of the host's GPUs to a Pod by setting the `NVIDIA_VISIBLE_DEVICES=all` environment variable on that Pod. We were able to work around this by setting `--set deviceListStrategy=volume-mounts` for the `nvdp/nvidia-device-plugin` Helm chart, along with these settings in the `/etc/nvidia-container-runtime/config.toml` file:

```
accept-nvidia-visible-devices-as-volume-mounts = true
accept-nvidia-visible-devices-envvar-when-unprivileged = false
```
