Commit d9617aa

Merge pull request #65 from Monokaix/dev
migrate vgpu to external project
2 parents 6a108e0 + 5ee11b7 commit d9617aa

33 files changed, +62 -2374 lines changed

.github/workflows/main.yml

-2
@@ -34,5 +34,3 @@ jobs:
       - run: make ubuntu20.04
       - run: TAG_VERSION="${BRANCH_NAME}" make push-tag
       - run: make push-latest
-      - run: make vgpu
-      - run: TAG_VERSION="${BRANCH_NAME}" make push-vgpu-tag

Makefile

+3-17
@@ -1,4 +1,4 @@
-# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -20,8 +20,7 @@
 
 DOCKER ?= docker
 REGISTRY ?= volcanosh
-VERSION ?= latest
-TAG_VERSION ?= 1.0.0
+VERSION ?= 1.0.0
 
 ##### Public rules #####
 
@@ -39,24 +38,11 @@ push-latest:
 	$(DOCKER) tag "$(REGISTRY)/volcano-device-plugin:$(VERSION)-ubuntu20.04" "$(REGISTRY)/volcano-device-plugin:latest"
 	$(DOCKER) push "$(REGISTRY)/volcano-device-plugin:latest"
 
-push-tag:
-	$(DOCKER) tag "$(REGISTRY)/volcano-device-plugin:$(VERSION)-ubuntu20.04" "$(REGISTRY)/volcano-device-plugin:$(TAG_VERSION)"
-	$(DOCKER) push "$(REGISTRY)/volcano-device-plugin:$(TAG_VERSION)"
-
-push-vgpu-tag:
-	$(DOCKER) tag "$(REGISTRY)/volcano-vgpu-device-plugin:$(VERSION)-ubuntu20.04" "$(REGISTRY)/volcano-vgpu-device-plugin:$(TAG_VERSION)"
-	$(DOCKER) push "$(REGISTRY)/volcano-vgpu-device-plugin:$(TAG_VERSION)"
-
 ubuntu20.04:
-	$(DOCKER) build --pull \
+	$(DOCKER) build --network=host --pull \
 		--tag $(REGISTRY)/volcano-device-plugin:$(VERSION)-ubuntu20.04 \
 		--file docker/amd64/Dockerfile.ubuntu20.04 .
 
-vgpu:
-	$(DOCKER) build --pull \
-		--tag $(REGISTRY)/volcano-vgpu-device-plugin:$(VERSION)-ubuntu20.04 \
-		--file docker/amd64/Dockerfile.vgpu-ubuntu20.04 .
-
 centos7:
 	$(DOCKER) build --pull \
 		--tag $(REGISTRY)/volcano-device-plugin:$(VERSION)-centos7 \

README.md

+11-36
@@ -73,41 +73,13 @@ We will be editing the docker daemon config file which is usually present at `/e
 Once you have enabled this option on *all* the GPU nodes you wish to use,
 you can then enable GPU support in your cluster by deploying the following Daemonset:
 
-VGPU:
-```
-$ kubectl create -f volcano-vgpu-device-plugin.yml
-```
-
-GPU-SHARE (**Will be deprecated in volcano v1.9**):
 ```shell
 $ kubectl create -f volcano-device-plugin.yml
 ```
 
 **Note** that volcano device plugin can be configured. For example, it can specify gpu strategy by adding in the yaml file ''args: ["--gpu-strategy=number"]'' under ''image: volcanosh/volcano-device-plugin''. More configuration can be found at [volcano device plugin configuration](https://github.com/volcano-sh/devices/blob/master/doc/config.md).
 
-### Running VGPU Jobs
-
-VGPU can be requested by both set "volcano.sh/vgpu-number" and "volcano.sh/vgpu-memory" in resource.limit
-
-```shell script
-$ cat <<EOF | kubectl apply -f -
-apiVersion: v1
-kind: Pod
-metadata:
-  name: gpu-pod1
-spec:
-  containers:
-    - name: cuda-container
-      image: nvidia/cuda:9.0-devel
-      command: ["sleep"]
-      args: ["100000"]
-      resources:
-        limits:
-          volcano.sh/vgpu-number: 2 # requesting 1 gpu cards
-          volcano.sh/vgpu-memory: 3000
-EOF
-```
-### Running GPU Sharing Jobs (**Will be deprecated in volcano v1.9**)
+### Running GPU Sharing Jobs (Without memory isolation)
 
 NVIDIA GPUs can now be shared via container level resource requirements using the resource name volcano.sh/gpu-memory:
 
@@ -144,7 +116,7 @@ spec:
 > **WARNING:** *if you don't request GPUs when using the device plugin with NVIDIA images all
 > the GPUs on the machine will be exposed inside your container.*
 
-### Running GPU Number Jobs (**Will be deprecated in volcano v1.9**)
+### Running GPU Number Jobs (Without number isolation)
 
 NVIDIA GPUs can now be requested via container level resource requirements using the resource name volcano.sh/gpu-number:
 
@@ -170,7 +142,7 @@ EOF
 
 Please note that:
 - the device plugin feature is beta as of Kubernetes v1.11.
-- the gpu-share device plugin is alpha and is missing the following features, and will be deprecated in volcano v1.9
+- the Volcano device plugin is alpha and is missing
     - More comprehensive GPU health checking features
     - GPU cleanup features
     - GPU hard isolation
@@ -180,16 +152,19 @@ The next sections are focused on building the device plugin and running it.
 
 ### With Docker
 
-#### Deploy as DaemonSet:
+#### Build
+```shell
+$ make ubuntu20.04.
+```
 
-GPU-SHARE:
+#### Run locally
 ```shell
-$ kubectl create -f nvidia-device-plugin.yml
+$ docker run --security-opt=no-new-privileges --cap-drop=ALL --network=none -it -v /var/lib/kubelet/device-plugins:/var/lib/kubelet/device-plugins nvidia/k8s-device-plugin:{version}
 ```
 
-VGPU:
+#### Deploy as DaemonSet:
 ```shell
-$ kubectl create -f nvidia-vgpu-device-plugin.yml
+$ kubectl create -f nvidia-device-plugin.yml
 ```
 
 # Issues and Contributing
# Issues and Contributing

cmd/vgpu/main.go

-191
This file was deleted.

cmd/vgpu/watchers.go

-48
This file was deleted.

docker/amd64/Dockerfile.ubuntu20.04

+1-1
@@ -40,4 +40,4 @@ ENV NVIDIA_DRIVER_CAPABILITIES=utility
 
 COPY --from=build /go/src/volcano.sh/devices/volcano-device-plugin /usr/bin/volcano-device-plugin
 
-ENTRYPOINT ["volcano-device-plugin"]
+ENTRYPOINT ["volcano-device-plugin"]
