Name and Version
./llama-cli --version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = NVIDIA T1000 8GB (NVIDIA) | uma: 0 | fp16: 1 | warp size: 32 | matrix cores: none
version: 4534 (955a6c2)
built with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for x86_64-linux-gnu
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
llama-cli
Command line
gdb llama-cli
GNU gdb (Ubuntu 15.0.50.20240403-0ubuntu1) 15.0.50.20240403-git
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty"for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration"for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from llama-cli...
This GDB supports auto-downloading debuginfo from the following URLs:
<https://debuginfod.ubuntu.com>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
Downloading separate debug info for /home/dcaimlpune/ashwini_wp/Chat/externals/llama.cpp/buildx86_vulkan/bin/llama-cli
(No debugging symbols found in llama-cli)
(gdb) run -m ../../../../models/unsloth-llama3.2-1b-finetune-function-calling-v3.Q4_K_M.gguf
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = NVIDIA T1000 8GB (NVIDIA) | uma: 0 | fp16: 1 | warp size: 32 | matrix cores: none
[New Thread 0x7ffff6a006c0 (LWP 21554)]
build: 4534 (955a6c2d) with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for x86_64-linux-gnu
main: llama backend init
main: load the model and apply lora adapter, if any
llama_model_load_from_file_impl: using device Vulkan0 (NVIDIA T1000 8GB) - 8192 MiB free
llama_model_loader: loaded meta data with 30 key-value pairs and 147 tensors from ../../../../models/unsloth-llama3.2-1b-finetune-function-calling-v3.Q4_K_M.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = llama
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = Llama 3.2 1b Instruct Bnb 4bit
llama_model_loader: - kv 3: general.organization str = Unsloth
llama_model_loader: - kv 4: general.finetune str = instruct-bnb-4bit
llama_model_loader: - kv 5: general.basename str = llama-3.2
llama_model_loader: - kv 6: general.size_label str = 1B
llama_model_loader: - kv 7: llama.block_count u32 = 16
llama_model_loader: - kv 8: llama.context_length u32 = 131072
llama_model_loader: - kv 9: llama.embedding_length u32 = 2048
llama_model_loader: - kv 10: llama.feed_forward_length u32 = 8192
llama_model_loader: - kv 11: llama.attention.head_count u32 = 32
llama_model_loader: - kv 12: llama.attention.head_count_kv u32 = 8
llama_model_loader: - kv 13: llama.rope.freq_base f32 = 500000.000000
llama_model_loader: - kv 14: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
llama_model_loader: - kv 15: llama.attention.key_length u32 = 64
llama_model_loader: - kv 16: llama.attention.value_length u32 = 64
llama_model_loader: - kv 17: general.file_type u32 = 15
llama_model_loader: - kv 18: llama.vocab_size u32 = 128256
llama_model_loader: - kv 19: llama.rope.dimension_count u32 = 64
llama_model_loader: - kv 20: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 21: tokenizer.ggml.pre str = llama-bpe
llama_model_loader: - kv 22: tokenizer.ggml.tokens arr[str,128256] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 23: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 24: tokenizer.ggml.merges arr[str,280147] = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "...
llama_model_loader: - kv 25: tokenizer.ggml.bos_token_id u32 = 128000
llama_model_loader: - kv 26: tokenizer.ggml.eos_token_id u32 = 128009
llama_model_loader: - kv 27: tokenizer.ggml.padding_token_id u32 = 128004
llama_model_loader: - kv 28: tokenizer.chat_template str = {{- bos_token }}\n{%- if custom_tools ...
llama_model_loader: - kv 29: general.quantization_version u32 = 2
llama_model_loader: - type f32: 34 tensors
llama_model_loader: - type q4_K: 96 tensors
llama_model_loader: - type q6_K: 17 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q4_K - Medium
print_info: file size = 762.81 MiB (5.18 BPW)
load: special tokens cache size = 256
load: token to piece cache size = 0.7999 MB
print_info: arch = llama
print_info: vocab_only = 0
print_info: n_ctx_train = 131072
print_info: n_embd = 2048
print_info: n_layer = 16
print_info: n_head = 32
print_info: n_head_kv = 8
print_info: n_rot = 64
print_info: n_swa = 0
print_info: n_embd_head_k = 64
print_info: n_embd_head_v = 64
print_info: n_gqa = 4
print_info: n_embd_k_gqa = 512
print_info: n_embd_v_gqa = 512
print_info: f_norm_eps = 0.0e+00
print_info: f_norm_rms_eps = 1.0e-05
print_info: f_clamp_kqv = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale = 0.0e+00
print_info: n_ff = 8192
print_info: n_expert = 0
print_info: n_expert_used = 0
print_info: causal attn = 1
print_info: pooling type = 0
print_info: rope type = 0
print_info: rope scaling = linear
print_info: freq_base_train = 500000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn = 131072
print_info: rope_finetuned = unknown
print_info: ssm_d_conv = 0
print_info: ssm_d_inner = 0
print_info: ssm_d_state = 0
print_info: ssm_dt_rank = 0
print_info: ssm_dt_b_c_rms = 0
print_info: model type = 1B
print_info: model params = 1.24 B
print_info: general.name = Llama 3.2 1b Instruct Bnb 4bit
print_info: vocab type = BPE
print_info: n_vocab = 128256
print_info: n_merges = 280147
print_info: BOS token = 128000 '<|begin_of_text|>'
print_info: EOS token = 128009 '<|eot_id|>'
print_info: EOT token = 128009 '<|eot_id|>'
print_info: EOM token = 128008 '<|eom_id|>'
print_info: PAD token = 128004 '<|finetune_right_pad_id|>'
print_info: LF token = 128 'Ä'
print_info: EOG token = 128008 '<|eom_id|>'
print_info: EOG token = 128009 '<|eot_id|>'
print_info: max token length = 256

Thread 1 "llama-cli" received signal SIGSEGV, Segmentation fault.
Download failed: Invalid argument. Continuing without source file ./string/../sysdeps/x86_64/multiarch/strlen-evex-base.S.
__strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex-base.S:81
warning: 81 ../sysdeps/x86_64/multiarch/strlen-evex-base.S: No such file or directory
(gdb) bt
#0 __strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex-base.S:81
#1 0x00007ffff6ca2b0c in ggml_vk_get_device(unsigned long) () from /home/dcaimlpune/ashwini_wp/Chat/externals/llama.cpp/buildx86_vulkan/ggml/src/ggml-vulkan/libggml-vulkan.so
#2 0x00007ffff6ca3d27 in ggml_backend_vk_host_buffer_type () from /home/dcaimlpune/ashwini_wp/Chat/externals/llama.cpp/buildx86_vulkan/ggml/src/ggml-vulkan/libggml-vulkan.so
#3 0x00007ffff7f035d7 in llama_model::load_tensors(llama_model_loader&) () from /home/dcaimlpune/ashwini_wp/Chat/externals/llama.cpp/buildx86_vulkan/src/libllama.so
#4 0x00007ffff7e8c53b in llama_model_load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, llama_model&, llama_model_params&) () from /home/dcaimlpune/ashwini_wp/Chat/externals/llama.cpp/buildx86_vulkan/src/libllama.so
#5 0x00007ffff7e91d3c in llama_model_load_from_file_impl(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, llama_model_params) () from /home/dcaimlpune/ashwini_wp/Chat/externals/llama.cpp/buildx86_vulkan/src/libllama.so
#6 0x00007ffff7e9200a in llama_model_load_from_file () from /home/dcaimlpune/ashwini_wp/Chat/externals/llama.cpp/buildx86_vulkan/src/libllama.so
#7 0x00005555555cc2fc in common_init_from_params(common_params&) ()
#8 0x000055555557458c in main ()
Problem description & steps to reproduce
I am trying to run llama-cli on an Ubuntu 24.04 machine, but when built with -DGGML_VULKAN=ON it crashes.
Using Vulkan SDK 1.3.296.
Thread 1 "llama-cli" received signal SIGSEGV, Segmentation fault.
Download failed: Invalid argument. Continuing without source file ./string/../sysdeps/x86_64/multiarch/strlen-evex-base.S.
__strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex-base.S:81
warning: 81 ../sysdeps/x86_64/multiarch/strlen-evex-base.S: No such file or directory
(gdb) bt
#0 __strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex-base.S:81
#1 0x00007ffff6ca2b0c in ggml_vk_get_device(unsigned long) () from /home/dcaimlpune/ashwini_wp/Chat/externals/llama.cpp/buildx86_vulkan/ggml/src/ggml-vulkan/libggml-vulkan.so
#2 0x00007ffff6ca3d27 in ggml_backend_vk_host_buffer_type () from /home/dcaimlpune/ashwini_wp/Chat/externals/llama.cpp/buildx86_vulkan/ggml/src/ggml-vulkan/libggml-vulkan.so
#3 0x00007ffff7f035d7 in llama_model::load_tensors(llama_model_loader&) () from /home/dcaimlpune/ashwini_wp/Chat/externals/llama.cpp/buildx86_vulkan/src/libllama.so
#4 0x00007ffff7e8c53b in llama_model_load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, llama_model&, llama_model_params&) ()
from /home/dcaimlpune/ashwini_wp/Chat/externals/llama.cpp/buildx86_vulkan/src/libllama.so
#5 0x00007ffff7e91d3c in llama_model_load_from_file_impl(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, llama_model_params) ()
from /home/dcaimlpune/ashwini_wp/Chat/externals/llama.cpp/buildx86_vulkan/src/libllama.so
#6 0x00007ffff7e9200a in llama_model_load_from_file () from /home/dcaimlpune/ashwini_wp/Chat/externals/llama.cpp/buildx86_vulkan/src/libllama.so
#7 0x00005555555cc2fc in common_init_from_params(common_params&) ()
#8 0x000055555557458c in main ()
Steps:
I compiled llama.cpp with GGML_VULKAN enabled (a command sketch follows below).
Set up the environment using Vulkan SDK 1.3.296.
When I run llama-cli, or any other example such as llama-simple-chat, it crashes.
Without Vulkan it works fine.
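For reference, a minimal sketch of the build and run steps, assuming the LunarG Vulkan SDK 1.3.296 tarball layout (with its setup-env.sh) and the buildx86_vulkan directory name seen in the backtrace paths; the reporter's exact cmake invocation is not shown in the issue:

# Assumption: SDK unpacked under ~/vulkan-sdk; setup-env.sh exports VULKAN_SDK, PATH, LD_LIBRARY_PATH
source ~/vulkan-sdk/1.3.296.0/setup-env.sh
# Configure and build llama.cpp with the Vulkan backend enabled
cmake -B buildx86_vulkan -DGGML_VULKAN=ON
cmake --build buildx86_vulkan -j
# Run any example against a GGUF model to reproduce the crash
./buildx86_vulkan/bin/llama-cli -m ../../../../models/unsloth-llama3.2-1b-finetune-function-calling-v3.Q4_K_M.gguf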
First Bad Commit
No response
Relevant log output
Thread 1 "llama-cli" received signal SIGSEGV, Segmentation fault.
Download failed: Invalid argument. Continuing without source file ./string/../sysdeps/x86_64/multiarch/strlen-evex-base.S.
__strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex-base.S:81
warning: 81 ../sysdeps/x86_64/multiarch/strlen-evex-base.S: No such file or directory
(gdb) bt
#0 __strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex-base.S:81
#1 0x00007ffff6ca2b0c in ggml_vk_get_device(unsigned long) () from /home/dcaimlpune/ashwini_wp/Chat/externals/llama.cpp/buildx86_vulkan/ggml/src/ggml-vulkan/libggml-vulkan.so
#2 0x00007ffff6ca3d27 in ggml_backend_vk_host_buffer_type () from /home/dcaimlpune/ashwini_wp/Chat/externals/llama.cpp/buildx86_vulkan/ggml/src/ggml-vulkan/libggml-vulkan.so
#3 0x00007ffff7f035d7 in llama_model::load_tensors(llama_model_loader&) () from /home/dcaimlpune/ashwini_wp/Chat/externals/llama.cpp/buildx86_vulkan/src/libllama.so
#4 0x00007ffff7e8c53b in llama_model_load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, llama_model&, llama_model_params&) ()
from /home/dcaimlpune/ashwini_wp/Chat/externals/llama.cpp/buildx86_vulkan/src/libllama.so
#5 0x00007ffff7e91d3c in llama_model_load_from_file_impl(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, llama_model_params) ()
from /home/dcaimlpune/ashwini_wp/Chat/externals/llama.cpp/buildx86_vulkan/src/libllama.so
#6 0x00007ffff7e9200a in llama_model_load_from_file () from /home/dcaimlpune/ashwini_wp/Chat/externals/llama.cpp/buildx86_vulkan/src/libllama.so
#7 0x00005555555cc2fc in common_init_from_params(common_params&) ()
#8 0x000055555557458c in main ()
Can you compile in debug mode (for CMake: -DCMAKE_BUILD_TYPE=Debug)? Then reproduce the crash with gdb again; it should show the specific line that crashed.
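A minimal sketch of that suggestion, assuming the same buildx86_vulkan directory and model path from the report; with debug symbols, gdb should resolve frame #1 (ggml_vk_get_device) to a source line instead of a bare offset:

# Reconfigure with debug symbols, keeping the Vulkan backend enabled
cmake -B buildx86_vulkan -DGGML_VULKAN=ON -DCMAKE_BUILD_TYPE=Debug
cmake --build buildx86_vulkan -j
# Reproduce under gdb and capture a full backtrace with locals
gdb --args ./buildx86_vulkan/bin/llama-cli -m ../../../../models/unsloth-llama3.2-1b-finetune-function-calling-v3.Q4_K_M.gguf
(gdb) run
(gdb) bt full
# Optionally inspect the string passed to strlen; on x86-64 the first argument
# arrives in rdi, though the register may already have been advanced at the fault
(gdb) frame 0
(gdb) x/s $rdi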