The latest version is not built on an arm #949

dedseclulz · 2023-04-13T18:31:54Z

I UNAME_S:  Linux
I UNAME_P:  aarch64
I UNAME_M:  aarch64
I CFLAGS:   -I.              -O3 -DNDEBUG -std=c11   -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wno-unused-function -pthread -mcpu=native
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -mcpu=native
I LDFLAGS:  
I CC:       cc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
I CXX:      g++ (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0

cc  -I.              -O3 -DNDEBUG -std=c11   -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wno-unused-function -pthread -mcpu=native   -c ggml.c -o ggml.o
ggml.c: In function ‘ggml_vec_dot_q4_1’:
ggml.c:2347:9: note: use ‘-flax-vector-conversions’ to permit conversions between vectors with differing element types or numbers of subparts
 2347 |         int32x4_t p_0 = vdotq_s32(vdupq_n_s32(0), v0_0l, v1_0l);
      |         ^~~~~~~~~
ggml.c:2347:51: error: incompatible type for argument 2 of ‘vdotq_s32’
 2347 |         int32x4_t p_0 = vdotq_s32(vdupq_n_s32(0), v0_0l, v1_0l);
      |                                                   ^~~~~
      |                                                   |
      |                                                   uint8x16_t
In file included from ggml.c:159:
/usr/lib/gcc/aarch64-linux-gnu/11/include/arm_neon.h:32134:37: note: expected ‘int8x16_t’ but argument is of type ‘uint8x16_t’
32134 | vdotq_s32 (int32x4_t __r, int8x16_t __a, int8x16_t __b)
      |                           ~~~~~~~~~~^~~
ggml.c:2347:58: error: incompatible type for argument 3 of ‘vdotq_s32’
 2347 |         int32x4_t p_0 = vdotq_s32(vdupq_n_s32(0), v0_0l, v1_0l);
      |                                                          ^~~~~
      |                                                          |
      |                                                          uint8x16_t
In file included from ggml.c:159:
/usr/lib/gcc/aarch64-linux-gnu/11/include/arm_neon.h:32134:52: note: expected ‘int8x16_t’ but argument is of type ‘uint8x16_t’
32134 | vdotq_s32 (int32x4_t __r, int8x16_t __a, int8x16_t __b)
      |                                          ~~~~~~~~~~^~~
ggml.c:2348:51: error: incompatible type for argument 2 of ‘vdotq_s32’
 2348 |         int32x4_t p_1 = vdotq_s32(vdupq_n_s32(0), v0_1l, v1_1l);
      |                                                   ^~~~~
      |                                                   |
      |                                                   uint8x16_t
In file included from ggml.c:159:
/usr/lib/gcc/aarch64-linux-gnu/11/include/arm_neon.h:32134:37: note: expected ‘int8x16_t’ but argument is of type ‘uint8x16_t’
32134 | vdotq_s32 (int32x4_t __r, int8x16_t __a, int8x16_t __b)
      |                           ~~~~~~~~~~^~~
ggml.c:2348:58: error: incompatible type for argument 3 of ‘vdotq_s32’
 2348 |         int32x4_t p_1 = vdotq_s32(vdupq_n_s32(0), v0_1l, v1_1l);
      |                                                          ^~~~~
      |                                                          |
      |                                                          uint8x16_t
In file included from ggml.c:159:
/usr/lib/gcc/aarch64-linux-gnu/11/include/arm_neon.h:32134:52: note: expected ‘int8x16_t’ but argument is of type ‘uint8x16_t’
32134 | vdotq_s32 (int32x4_t __r, int8x16_t __a, int8x16_t __b)
      |                                          ~~~~~~~~~~^~~
ggml.c:2350:30: error: incompatible type for argument 2 of ‘vdotq_s32’
 2350 |         p_0 = vdotq_s32(p_0, v0_0h, v1_0h);
      |                              ^~~~~
      |                              |
      |                              uint8x16_t
In file included from ggml.c:159:
/usr/lib/gcc/aarch64-linux-gnu/11/include/arm_neon.h:32134:37: note: expected ‘int8x16_t’ but argument is of type ‘uint8x16_t’
32134 | vdotq_s32 (int32x4_t __r, int8x16_t __a, int8x16_t __b)
      |                           ~~~~~~~~~~^~~
ggml.c:2350:37: error: incompatible type for argument 3 of ‘vdotq_s32’
 2350 |         p_0 = vdotq_s32(p_0, v0_0h, v1_0h);
      |                                     ^~~~~
      |                                     |
      |                                     uint8x16_t
In file included from ggml.c:159:
/usr/lib/gcc/aarch64-linux-gnu/11/include/arm_neon.h:32134:52: note: expected ‘int8x16_t’ but argument is of type ‘uint8x16_t’
32134 | vdotq_s32 (int32x4_t __r, int8x16_t __a, int8x16_t __b)
      |                                          ~~~~~~~~~~^~~
ggml.c:2351:30: error: incompatible type for argument 2 of ‘vdotq_s32’
 2351 |         p_1 = vdotq_s32(p_1, v0_1h, v1_1h);
      |                              ^~~~~
      |                              |
      |                              uint8x16_t
In file included from ggml.c:159:
/usr/lib/gcc/aarch64-linux-gnu/11/include/arm_neon.h:32134:37: note: expected ‘int8x16_t’ but argument is of type ‘uint8x16_t’
32134 | vdotq_s32 (int32x4_t __r, int8x16_t __a, int8x16_t __b)
      |                           ~~~~~~~~~~^~~
ggml.c:2351:37: error: incompatible type for argument 3 of ‘vdotq_s32’
 2351 |         p_1 = vdotq_s32(p_1, v0_1h, v1_1h);
      |                                     ^~~~~
      |                                     |
      |                                     uint8x16_t
In file included from ggml.c:159:
/usr/lib/gcc/aarch64-linux-gnu/11/include/arm_neon.h:32134:52: note: expected ‘int8x16_t’ but argument is of type ‘uint8x16_t’
32134 | vdotq_s32 (int32x4_t __r, int8x16_t __a, int8x16_t __b)
      |                                          ~~~~~~~~~~^~~
make: *** [Makefile:143: ggml.o] Error 1

$ lscpu

  CPU op-mode(s):       32-bit, 64-bit
  Byte Order:           Little Endian
CPU(s):                 4
  On-line CPU(s) list:  0-3
Vendor ID:              ARM
  Model name:           Neoverse-N1
    Model:              1
    Thread(s) per core: 1
    Core(s) per socket: 4
    Socket(s):          1
    Stepping:           r3p1
    BogoMIPS:           50.00
    Flags:              fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp a
                        simdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
NUMA:
  NUMA node(s):         1
  NUMA node0 CPU(s):    0-3
Vulnerabilities:
  Itlb multihit:        Not affected
  L1tf:                 Not affected
  Mds:                  Not affected
  Meltdown:             Not affected
  Mmio stale data:      Not affected
  Retbleed:             Not affected
  Spec store bypass:    Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:           Mitigation; __user pointer sanitization
  Spectre v2:           Mitigation; CSV2, BHB
  Srbds:                Not affected
  Tsx async abort:      Not affected

Operating System, e.g. for Linux:

$ uname -a
Linux 5.15.0-1021-oracle #27-Ubuntu SMP Fri Oct 14 20:04:20 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux

SDK version, e.g. for Linux:

$ make --version
GNU Make 4.3
$ g++ --version
g++ (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0

Steps to Reproduce

make

environment info:

commit 0e07e6a8399fd993739a3ba3c6f95f92bfab6f58

jasonharrison · 2023-04-13T19:04:24Z

d990e3f broke ARM builds.

Last good commit: 9190e8e

ggerganov · 2023-04-14T06:46:28Z

Should be fixed via 0f07cac
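
Editorial note on the aarch64 errors reported above: GCC rejects the calls because `vdotq_s32` expects `int8x16_t` operands, while `v0_0l`, `v1_0l` and the other vectors are `uint8x16_t`. The sketch below shows one way to make the types agree, by using the unsigned intrinsic `vdotq_u32` with a `uint32x4_t` accumulator. It is an illustration under stated assumptions, not necessarily what 0f07cac does. The helper name `dot_u8x16` is hypothetical, and the scalar fallback exists only so the snippet also compiles on machines without the dot-product extension.

```c
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON) && defined(__ARM_FEATURE_DOTPROD)
#include <arm_neon.h>
#define HAVE_NEON_DOTPROD 1
#endif

/* Hypothetical helper: dot product of two blocks of 16 unsigned bytes. */
static uint32_t dot_u8x16(const uint8_t *a, const uint8_t *b) {
#ifdef HAVE_NEON_DOTPROD
    const uint8x16_t va = vld1q_u8(a);
    const uint8x16_t vb = vld1q_u8(b);
    /* Unsigned inputs pair with the unsigned intrinsic and accumulator,
       so no implicit vector conversion is needed. */
    const uint32x4_t acc = vdotq_u32(vdupq_n_u32(0), va, vb);
    return vaddvq_u32(acc); /* horizontal add of the four lanes */
#else
    /* Portable scalar fallback with the same result. */
    uint32_t sum = 0;
    for (int i = 0; i < 16; ++i) {
        sum += (uint32_t)a[i] * (uint32_t)b[i];
    }
    return sum;
#endif
}
```

An alternative, when every byte is known to fit in the signed range (as unpacked 4-bit values do), is to keep `vdotq_s32` and convert the operands explicitly with `vreinterpretq_s8_u8`.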

KGOrphanides · 2023-04-17T15:13:19Z

There appears to be a regression in recent releases, including efd0564.
See also #58 (comment) for an earlier report of a similar issue.

I llama.cpp build info: 
I UNAME_S:  Linux
I UNAME_P:  unknown
I UNAME_M:  armv7l
I CFLAGS:   -I.              -O3 -DNDEBUG -std=c11   -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wno-unused-function -pthread -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access -funsafe-math-optimizations
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread
I LDFLAGS:  
I CC:       cc (Raspbian 10.2.1-6+rpi1) 10.2.1 20210110
I CXX:      g++ (Raspbian 10.2.1-6+rpi1) 10.2.1 20210110

cc  -I.              -O3 -DNDEBUG -std=c11   -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wno-unused-function -pthread -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access -funsafe-math-optimizations   -c ggml.c -o ggml.o
g++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -c llama.cpp -o llama.o
g++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -c examples/common.cpp -o common.o
ggml.c: In function ‘quantize_row_q8_0’:
ggml.c:1101:36: warning: implicit declaration of function ‘vcvtnq_s32_f32’; did you mean ‘vcvtq_s32_f32’? [-Wimplicit-function-declaration]
 1101 |             const int32x4_t   vi = vcvtnq_s32_f32(v);
      |                                    ^~~~~~~~~~~~~~
      |                                    vcvtq_s32_f32
ggml.c:1101:36: error: incompatible types when initializing type ‘int32x4_t’ using type ‘int’
ggml.c: In function ‘ggml_vec_dot_q4_0_q8_0’:
ggml.c:2775:34: warning: implicit declaration of function ‘vuzp1q_s8’; did you mean ‘vuzpq_s8’? [-Wimplicit-function-declaration]
 2775 |         const int8x16_t v1_0ls = vuzp1q_s8(v1_0l, v1_0h);
      |                                  ^~~~~~~~~
      |                                  vuzpq_s8
ggml.c:2775:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c:2776:34: warning: implicit declaration of function ‘vuzp2q_s8’; did you mean ‘vuzpq_s8’? [-Wimplicit-function-declaration]
 2776 |         const int8x16_t v1_0hs = vuzp2q_s8(v1_0l, v1_0h);
      |                                  ^~~~~~~~~
      |                                  vuzpq_s8
ggml.c:2776:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c:2777:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
 2777 |         const int8x16_t v1_1ls = vuzp1q_s8(v1_1l, v1_1h);
      |                                  ^~~~~~~~~
ggml.c:2778:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
 2778 |         const int8x16_t v1_1hs = vuzp2q_s8(v1_1l, v1_1h);
      |                                  ^~~~~~~~~
make: *** [Makefile:143: ggml.o] Error 1
make: *** Waiting for unfinished jobs....
llama.cpp:65:2: warning: extra ‘;’ [-Wpedantic]
   65 | };
      |  ^
llama.cpp:77:2: warning: extra ‘;’ [-Wpedantic]
   77 | };
      |  ^
llama.cpp:90:2: warning: extra ‘;’ [-Wpedantic]
   90 | };
      |  ^
In file included from /usr/include/c++/10/vector:72,
                 from llama_util.h:16,
                 from llama.cpp:6:
/usr/include/c++/10/bits/vector.tcc: In member function ‘void std::vector<_Tp, _Alloc>::_M_realloc_insert(std::vector<_Tp, _Alloc>::iterator, _Args&& ...) [with _Args = {const double&}; _Tp = double; _Alloc = std::allocator<double>]’:
/usr/include/c++/10/bits/vector.tcc:426:7: note: parameter passing for argument of type ‘std::vector<double>::iterator’ changed in GCC 7.1
  426 |       vector<_Tp, _Alloc>::
      |       ^~~~~~~~~~~~~~~~~~~
In file included from /usr/include/c++/10/vector:67,
                 from llama_util.h:16,
                 from llama.cpp:6:
/usr/include/c++/10/bits/stl_vector.h: In member function ‘void std::discrete_distribution<_IntType>::param_type::_M_initialize() [with _IntType = int]’:
/usr/include/c++/10/bits/stl_vector.h:1198:21: note: parameter passing for argument of type ‘__gnu_cxx::__normal_iterator<double*, std::vector<double> >’ changed in GCC 7.1
 1198 |    _M_realloc_insert(end(), __x);
      |    ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
/usr/include/c++/10/bits/stl_vector.h:1198:21: note: parameter passing for argument of type ‘__gnu_cxx::__normal_iterator<double*, std::vector<double> >’ changed in GCC 7.1
 1198 |    _M_realloc_insert(end(), __x);
      |    ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~

This is on a 4GB Raspberry Pi 400.
0f07cac builds fine.
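
Editorial note on the armv7l log above: `vuzp1q_s8` and `vuzp2q_s8` are declared only for AArch64 in GCC's `arm_neon.h`, so a 32-bit build sees them as implicit `int` functions, which produces the "incompatible types" errors. The sketch below shows one possible compatibility shim built on `vuzpq_s8`, which is available on 32-bit ARM and returns both halves of the de-interleave at once. The `compat_*` and `*_ref` names are hypothetical and this is not claimed to be the fix the project adopted. The scalar reference functions document the intended lane order and compile on any platform.

```c
#include <stdint.h>

#if defined(__ARM_NEON) && !defined(__aarch64__)
#include <arm_neon.h>

/* Hypothetical shims for 32-bit ARM. vuzpq_s8 de-interleaves a:b into
   val[0] (even-indexed elements) and val[1] (odd-indexed elements). */
static inline int8x16_t compat_vuzp1q_s8(int8x16_t a, int8x16_t b) {
    const int8x16x2_t r = vuzpq_s8(a, b);
    return r.val[0];
}

static inline int8x16_t compat_vuzp2q_s8(int8x16_t a, int8x16_t b) {
    const int8x16x2_t r = vuzpq_s8(a, b);
    return r.val[1];
}
#endif

/* Scalar reference: even-indexed elements of a, then of b. */
static void uzp1_s8_ref(const int8_t a[16], const int8_t b[16], int8_t out[16]) {
    for (int i = 0; i < 8; ++i) {
        out[i]     = a[2 * i];
        out[i + 8] = b[2 * i];
    }
}

/* Scalar reference: odd-indexed elements of a, then of b. */
static void uzp2_s8_ref(const int8_t a[16], const int8_t b[16], int8_t out[16]) {
    for (int i = 0; i < 8; ++i) {
        out[i]     = a[2 * i + 1];
        out[i + 8] = b[2 * i + 1];
    }
}
```

`vcvtnq_s32_f32` from the same log would need a similar treatment; it is left out here to keep the sketch short.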

ggerganov closed this as completed Apr 14, 2023

novag mentioned this issue Apr 14, 2023

ggml : fix q4_1 dot product types ggml-org/whisper.cpp#759

Merged

boegel mentioned this issue Mar 13, 2024

{2023.06}[foss/2022b] R v4.2.2 EESSI/software-layer#452

Merged

Bearsaerker mentioned this issue Mar 12, 2025

Eval bug: Gemma 3 extremly slow prompt processing when using quantized kv cache. #12352

Open
