There is a chance I'm doing something incorrect here, and if so I'd love to better understand what. But as of now, I cannot get llama.cpp to run successfully with SYCL on my A770. It detects the GPU and begins to load, but then just hangs, with the last log line being "[SYCL] call ggml_backend_sycl_host_buffer_type".
While it is hung like this, a single CPU core is pegged at 100% by the process.
I've left it like this for hours, and it never progresses.
The full logs are:
Steps to reproduce:
Intel Arc A770
Debian 12 / 6.8.7-zabbly+ kernel
Running release b2749 in Docker, using the following Dockerfile:
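The Dockerfile itself was not captured here. A minimal sketch, assuming the intel/oneapi-basekit base image and the LLAMA_SYCL build flags that llama.cpp's SYCL guide used around b2749 (the image tag, package list, and source-fetch method are assumptions), arranged so the binary lands at the /app/llama.cpp-b2749/build/bin/main path used below:

```Dockerfile
# Hypothetical reconstruction -- the original Dockerfile was not captured.
FROM intel/oneapi-basekit:2024.0.1-devel-ubuntu22.04

RUN apt-get update && apt-get install -y git cmake build-essential curl

WORKDIR /app
# Fetch the b2749 release tarball; it extracts to llama.cpp-b2749
RUN curl -L https://github.com/ggerganov/llama.cpp/archive/refs/tags/b2749.tar.gz | tar xz

WORKDIR /app/llama.cpp-b2749
# LLAMA_SYCL was the SYCL switch at b2749 (later renamed GGML_SYCL);
# icx/icpx come preconfigured in the oneAPI basekit image
RUN cmake -B build -DLLAMA_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx && \
    cmake --build build --config Release -j
```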
Executed with the following command:
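The exact invocation was not captured either. A minimal sketch, assuming an image tag of llama-sycl and a host model directory mounted at /models (both assumptions), with --device /dev/dri passing the Arc GPU through to the container:

```sh
# Hypothetical reconstruction -- image name and host model path are assumptions.
docker build -t llama-sycl .
# --device /dev/dri exposes the Intel GPU to the container
docker run -it --rm --device /dev/dri -v /path/to/models:/models llama-sycl /bin/bash
```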
I then run the following command:
```sh
/app/llama.cpp-b2749/build/bin/main -m /models/llama-2-7b.Q4_0.gguf -i -ngl -1
```
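Turns out, the issue I was hitting was a Linux kernel-level regression. Your question about the driver version was helpful, as it ultimately led me to this open issue: intel/compute-runtime#726

Downgrading from the 6.8.7 kernel to 6.8.4 resolved the issue I was experiencing.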