Why is the server not provided with binaries? #1578

Closed
mirek190 opened this issue May 23, 2023 · 10 comments · Fixed by #1610

Comments

@mirek190

Question ... why is the server not provided with binaries?

@BarfingLemurs
Contributor

There are: https://github.com/ggerganov/llama.cpp/releases/tag/master-7d87381

You need to build the ARM versions yourself, though.

@noprotocolunit

noprotocolunit commented May 24, 2023

Perhaps I'm missing something, but I downloaded llama-master-7d87381-bin-win-openblas-x64.zip and the server executable is not among the files. Am I doing something wrong?

@maddes8cht
Contributor

maddes8cht commented May 24, 2023

There is still no server.exe in any of the provided binaries, in any of the latest releases.

@mirek190
Author

It is not present. That's why I'm asking.
I can build it myself, but still ...

@FSSRepo
Collaborator

FSSRepo commented May 25, 2023

The LLAMA_BUILD_SERVER option is OFF by default.
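
For anyone who just wants the binary in the meantime, a minimal sketch of turning that option on at configure time, assuming a standard out-of-tree CMake build (the option name comes from the comment above):

```sh
# Configure with the server example enabled, then build.
mkdir build && cd build
cmake .. -DLLAMA_BUILD_SERVER=ON
cmake --build . --config Release

# The resulting binary should land under build/bin/ (the exact
# location can vary by generator and platform).
```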

@mirek190
Author

...yes... but why?

@jp-aa

jp-aa commented May 26, 2023

I don't mean to be annoying, but the server doesn't get built by default when using make, nor does it appear in the Makefile (I'm not really good with build systems and such). It does seem to be included when building with cmake, but passing the NVCC flags I need to cmake isn't working for me. Is there a reason it's not built by default, or at least included in the Makefile?
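
On the flag-passing point, a sketch of one way to do it, assuming the CUDA build was toggled by LLAMA_CUBLAS at the time and that the flags in question are ordinary NVCC arguments (CMAKE_CUDA_FLAGS and CMAKE_CUDA_ARCHITECTURES are standard CMake variables, not llama.cpp-specific ones):

```sh
# Sketch: forwarding NVCC flags at configure time instead of via make.
mkdir build && cd build
cmake .. -DLLAMA_CUBLAS=ON \
         -DLLAMA_BUILD_SERVER=ON \
         -DCMAKE_CUDA_ARCHITECTURES=86 \
         -DCMAKE_CUDA_FLAGS="--use_fast_math"
cmake --build . --config Release
```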

@KerfuffleV2
Collaborator

KerfuffleV2 commented May 26, 2023

Looks like you can only build the server via cmake.

From poking around a bit, it's actually quite large compared to all the other examples and includes a copy of an HTTP library and a JSON parsing library. I'd guess it's disabled by default because it's relatively slow to compile.

Just for comparison, the main example (basically what people call llama.cpp) is around 25 KB of source, with about another 33 KB of common code shared with all the other examples. For the server, the JSON parsing and HTTP support come to about 1 MB of source code.

This answer doesn't necessarily help someone that just wants that feature but hopefully at least makes it seem like less of an arbitrary decision.

@jp-aa

jp-aa commented May 26, 2023

> I'd guess it's disabled by default because it's relatively slow to compile.

Well, it's not that bad. I managed to compile the server with make just by adding "server" to the default target, the clean target, and an entry in the examples section of the Makefile that looks the same as most of the other entries (see the sketch below).
My only problem now is offloading model layers to the GPU while running the server binary. I have no idea if that's even a feature yet. If someone knows how, please tell me.
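
A sketch of the kind of Makefile entry described, modeled on the pattern of the existing example targets; the exact target list and prerequisites are assumptions and may need adjusting to match the real Makefile:

```makefile
# Hypothetical additions, patterned on the other example targets.
# (Recipe lines must be indented with a tab.)
default: main quantize perplexity embedding server

server: examples/server/server.cpp ggml.o llama.o common.o $(OBJS)
	$(CXX) $(CXXFLAGS) $^ -o $@ $(LDFLAGS)

clean:
	rm -vf *.o main quantize perplexity embedding server
```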

@SlyEcho
Collaborator

SlyEcho commented May 26, 2023

It just needs to be added to the release workflow, nothing special; something like the excerpt below.
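
A hypothetical excerpt of what that change could look like; the actual file layout and step names under .github/workflows may differ:

```yaml
# Hypothetical build step; real step names and matrix entries may differ.
- name: Build
  run: |
    mkdir build
    cd build
    cmake .. -DLLAMA_BUILD_SERVER=ON
    cmake --build . --config Release
```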
