RuntimeError: synStatus=26 [Generic failure] Device acquire failed. #1611
Comments
@VinayHN1365466 It looks like your devices are already busy or are somehow unavailable. Can you run
No processes are running.
Can you try adding
Thanks Regisss. I tried with --privileged in Docker, but it's still the same error:
docker run --privileged -it --name optimum_118_8cards_vinay_new_1234 --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host -v /mode_file/:/root/.cache/ -v /optimum-habana:/root/optimum-habana vault.habana.ai/gaudi-docker/1.18.0/ubuntu22.04/habanalabs/pytorch-installer-2.4.0:lates
Can you paste the complete logs you're getting here?
~/optimum-habana/examples/text-generation# python run_generation.py
Does running the following work?
import torch
import habana_frameworks.torch.hpu
a = torch.tensor(1, device="hpu")
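A slightly more verbose variant of this check may make the failure easier to report. This is only a sketch: check_hpu is a hypothetical helper name, and it assumes torch and habana_frameworks are installed as they would be inside the Gaudi Docker image. Instead of raising, it returns the exception text so it can be pasted into the thread.

```python
def check_hpu():
    """Try to place a tensor on the HPU; return (ok, message)."""
    try:
        # Same imports as the three-line check suggested in the thread.
        import torch
        import habana_frameworks.torch.hpu  # noqa: F401
    except Exception as exc:  # missing package or broken install
        return False, f"import failed: {exc!r}"
    try:
        a = torch.tensor(1, device="hpu")
    except Exception as exc:
        # On the affected machine this is where the
        # "Device acquire failed" RuntimeError would be expected to surface.
        return False, f"tensor creation failed: {exc!r}"
    return True, f"tensor created on {a.device}"


if __name__ == "__main__":
    ok, message = check_hpu()
    print("OK" if ok else "FAILED", "-", message)
```

If this reports a failure at the tensor creation step, the problem is independent of run_generation.py and lies with device access itself.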
I got the same error.
Can you reboot this instance?
Sorry, I don't have access to reboot the instance :(
@VinayHN1365466 Can you capture the output of dmesg -T? Thanks.
On some cloud machines, you need to add sudo to see the processes of all other users.
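To illustrate why sudo matters here, the sketch below scans /proc for processes that hold a device node open and counts how many processes could not be inspected for lack of permission. It is not a replacement for the vendor tools. The name find_device_holders is hypothetical, and the default prefixes /dev/accel and /dev/hl are an assumption about where the Gaudi device nodes live on your machine, so adjust them to whatever ls /dev shows.

```python
import os


def find_device_holders(prefixes=("/dev/accel", "/dev/hl")):
    """Return (sorted PIDs holding a matching file open, count of unreadable processes).

    The default prefixes are an assumption about the device node paths.
    """
    holders, unreadable = set(), 0
    entries = os.listdir("/proc") if os.path.isdir("/proc") else []
    for entry in entries:
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = f"/proc/{entry}/fd"
        try:
            fds = os.listdir(fd_dir)
        except PermissionError:
            unreadable += 1  # another user's process; visible only with sudo
            continue
        except OSError:
            continue  # process exited while scanning
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed in the meantime
            if target.startswith(tuple(prefixes)):
                holders.add(int(entry))
                break
    return sorted(holders), unreadable


if __name__ == "__main__":
    pids, skipped = find_device_holders()
    print("PIDs holding a device node:", pids)
    print("processes that could not be inspected:", skipped)
```

A non-zero count of uninspected processes when run as a normal user means an empty PID list is not conclusive; rerunning with sudo gives the full picture.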
I rebooted the instance, but it's still the same issue :(
System Info
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
Expected behavior
Execute Successfully