Releases · 3Simplex/llama.cpp

03 Dec 16:27

3b4f2e3

b4248

llama : add missing LLAMA_API for llama_chat_builtin_templates (#10636)

Assets 22

25 Nov 16:58

github-actions

b4164

9ca2e67

server : add speculative decoding support (#10455)

* server : add speculative decoding support

ggml-ci

* server : add helper function slot.can_speculate()

ggml-ci

Assets 21

22 Nov 14:32

github-actions

b4153

6dfcfef

ci: Update oneAPI runtime dll packaging (#10428)

These are the minimum runtime DLL dependencies for oneAPI 2025.0

Assets 22

20 Nov 20:51

github-actions

b4145

9abe9ee

vulkan: predicate max operation in soft_max shaders/soft_max (#10437)

Fixes #10434

Assets 21

19 Nov 15:20

github-actions

b4132

3ee6382

cuda : fix CUDA_FLAGS not being applied (#10403)

Assets 21

18 Nov 17:14

github-actions

b4125

531cb1c

Skip searching root path for cross-compile builds (#10383)

Assets 21

16 Nov 18:29

github-actions

b4100

bcdb7a2

server: (web UI) Add samplers sequence customization (#10255)

* Samplers sequence: simplified and input field.

* Removed unused function

* Modify and use `settings-modal-short-input`

* rename "name" --> "label"

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>

Assets 21

12 Nov 14:48

github-actions

b4067

54ef9cf

vulkan: Throttle the number of shader compiles during the build step.…

Assets 22

09 Nov 16:16

github-actions

b4061

6423c65

metal : reorder write loop in mul mat kernel + style (#10231)

* metal : reorder write loop

* metal : int -> short, style

ggml-ci

Assets 22

07 Nov 17:15

github-actions

b4042

5107e8c

DRY: Fixes clone functionality (#10192)

Assets 22
