
Bug: GGML assert with bf16, RTX3090 #8234


Closed
micsthepick opened this issue Jul 1, 2024 · 1 comment
Labels
bug-unconfirmed, medium severity (used to report medium-severity bugs in llama.cpp, e.g. malfunctioning features that are still usable)

Comments


micsthepick commented Jul 1, 2024

What happened?

./llama-server -ngl 99 -cb -c 65536 -np 32 -m models/Phi-3-mini-128k-instruct/ggml-model-bf16.gguf 
...
GGML_ASSERT: ggml/src/ggml-cuda.cu:1257: to_fp32_cuda != nullptr
[New LWP 934430]
[New LWP 934432]
[New LWP 934433]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007fb1ba523c7f in __GI___wait4 (pid=934542, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:27
27      ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory.
#0  0x00007fb1ba523c7f in __GI___wait4 (pid=934542, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:27
27      in ../sysdeps/unix/sysv/linux/wait4.c
#1  0x0000559119a6c7eb in ggml_print_backtrace ()
#2  0x000055911992c1b5 in ggml_cuda_op_mul_mat_cublas(ggml_backend_cuda_context&, ggml_tensor const*, ggml_tensor const*, ggml_tensor*, char const*, float const*, char const*, float*, long, long, long, long, CUstream_st*) ()
#3  0x000055911992e781 in ggml_cuda_op_mul_mat(ggml_backend_cuda_context&, ggml_tensor const*, ggml_tensor const*, ggml_tensor*, void (*)(ggml_backend_cuda_context&, ggml_tensor const*, ggml_tensor const*, ggml_tensor*, char const*, float const*, char const*, float*, long, long, long, long, CUstream_st*), void (*)(float const*, void*, long, long, long, long, ggml_type, CUstream_st*)) ()
#4  0x000055911992f7a5 in ggml_cuda_mul_mat(ggml_backend_cuda_context&, ggml_tensor const*, ggml_tensor const*, ggml_tensor*) ()
#5  0x0000559119933cff in ggml_backend_cuda_graph_compute(ggml_backend*, ggml_cgraph*) ()
#6  0x0000559119abb4bb in ggml_backend_sched_graph_compute_async ()
#7  0x0000559119b0d7b0 in llama_decode ()
#8  0x0000559119bcd039 in llama_init_from_gpt_params(gpt_params&) ()
#9  0x0000559119c78495 in server_context::load_model(gpt_params const&) ()
#10 0x0000559119913d7a in main ()
[Inferior 1 (process 934429) detached]
./start_phi.sh: line 1: 934429 Aborted 
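For context, the assertion comes from the cuBLAS matmul path in the CUDA backend: before calling cuBLAS, ggml_cuda_op_mul_mat_cublas looks up a converter that turns the weight tensor's type into fp32 and asserts that one exists. With bf16 weights that lookup evidently returns nullptr at this build, which is what aborts the server. Below is a minimal standalone sketch of that pattern; the names mirror the assert and backtrace, but the bodies are stubs, not the real ggml-cuda.cu code:

// Standalone sketch (stubs, not actual ggml-cuda.cu) of the check that fails.
// In the real code the lookup covers f16 and the quantized types; bf16 has no
// entry at version 3265, so the function pointer comes back null and the
// GGML_ASSERT aborts the process, as seen in the log above.
#include <cassert>
#include <cstdint>

enum ggml_type { GGML_TYPE_F16, GGML_TYPE_BF16 /* , quantized types, ... */ };

// Launcher that converts k elements of some type to fp32 (CUDA stream elided).
typedef void (*to_fp32_cuda_t)(const void * x, float * y, int64_t k);

static void convert_fp16_to_fp32_stub(const void *, float *, int64_t) {
    // the real code launches a CUDA conversion kernel; omitted in this sketch
}

static to_fp32_cuda_t ggml_get_to_fp32_cuda(ggml_type type) {
    switch (type) {
        case GGML_TYPE_F16: return convert_fp16_to_fp32_stub; // supported
        default:            return nullptr;                   // bf16 lands here
    }
}

int main() {
    const to_fp32_cuda_t to_fp32_cuda = ggml_get_to_fp32_cuda(GGML_TYPE_BF16);
    assert(to_fp32_cuda != nullptr); // the GGML_ASSERT at ggml-cuda.cu:1257
    return 0;
}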

Name and Version

./llama-server --version
version: 3265 (72272b8)
built with cc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0 for x86_64-linux-gnu

What operating system are you seeing the problem on?

Linux, Windows

Relevant log output

No response

micsthepick added the bug-unconfirmed and medium severity labels on Jul 1, 2024
micsthepick changed the title from "Bug: GGML assert" to "Bug: GGML assert with bf16, RTX3090" on Jul 1, 2024

bfroemel commented Jul 1, 2024

duplicate of #7211
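(An aside not stated in this thread: since the CUDA backend at this build has no bf16-to-fp32 converter on this path, one plausible workaround is to serve an f16 GGUF instead of the bf16 one. Assuming the usual llama.cpp tooling of that era, where llama-quantize accepts f16 as an output type, the conversion would look something like:

./llama-quantize models/Phi-3-mini-128k-instruct/ggml-model-bf16.gguf models/Phi-3-mini-128k-instruct/ggml-model-f16.gguf f16

Treat the exact invocation as an assumption, not something confirmed in this issue.)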

micsthepick closed this as not planned (won't fix, can't repro, duplicate, stale) on Jul 1, 2024