EKS Support #40

Open
aldmbmtl opened this issue Jul 24, 2023 · 1 comment

Comments

@aldmbmtl

aldmbmtl commented Jul 24, 2023

Hello!

I am trying to get this to work on EKS. Sadly, the device plugin doesn't seem to see the GPU. I am using g4ad instances, and I made a custom AMI running the latest version of the AMD GPU Pro drivers that I could get from Amazon (20.20). When scaling from zero, the cluster autoscaler also isn't detecting the resource "amd.com/gpu: 1", but I don't think fixing that would solve this other issue.

When I launch a node and then deploy the device plugin, the pod still won't be scheduled to the node. Any idea as to why?

```
I0724 03:19:09.793086       1 main.go:305] ./k8s-device-plugin version v1.18.1-21-g2e5bbc7
I0724 03:19:09.793089       1 main.go:305] hwloc: _VERSION: 2.9.1, _API_VERSION: 0x00020800, _COMPONENT_ABI: 7, Runtime: 0x00020800
I0724 03:19:09.793105       1 manager.go:42] Starting device plugin manager
I0724 03:19:09.793108       1 manager.go:46] Registering for system signal notifications
I0724 03:19:09.793346       1 manager.go:52] Registering for notifications of filesystem changes in device plugin directory
I0724 03:19:09.793400       1 manager.go:60] Starting Discovery on new plugins
I0724 03:19:09.793416       1 manager.go:66] Handling incoming signals
```

This is the log from the device plugin manager. I assume I should be seeing something else? We would love to get off of NVIDIA for our containerized workstations, but this has been blocking us. I assume it is because AWS doesn't seem to want to support Radeon :disappointed:

 Thanks!
@PierreJiji

PierreJiji commented Oct 11, 2024

Is there any update on this?
