Why is the server not provided with binaries? #1578

Closed
mirek190 opened this issue May 23, 2023 · 10 comments · Fixed by #1610

Comments

@mirek190

Question ... why is the server not provided with binaries?

@BarfingLemurs
Contributor

There are: https://github.com/ggerganov/llama.cpp/releases/tag/master-7d87381

You need to build the ARM versions yourself, though.

@noprotocolunit

noprotocolunit commented May 24, 2023

Perhaps I'm missing something, but I downloaded llama-master-7d87381-bin-win-openblas-x64.zip and the server executable is not among the files. Am I doing something wrong?

@maddes8cht
Contributor

maddes8cht commented May 24, 2023

There is still no server.exe in any of the provided binaries, in any of the latest releases.

@mirek190
Author

It is not present. That's why I'm asking.
I can build it myself, but still ...

@FSSRepo
Collaborator

FSSRepo commented May 25, 2023

The LLAMA_BUILD_SERVER option is OFF by default.
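
For anyone who just wants the binary in the meantime, a minimal sketch of turning that option on at configure time, assuming a standard out-of-tree CMake build (the option name comes from the comment above):

```sh
# Configure with the server example enabled, then build.
mkdir build && cd build
cmake .. -DLLAMA_BUILD_SERVER=ON
cmake --build . --config Release

# The resulting binary should land under build/bin/ (the exact
# location can vary by generator and platform).
```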

@mirek190
Author

...yes... but why?

@jp-aa

jp-aa commented May 26, 2023

I don't mean to be annoying, but the server doesn't get built by default when using make, nor does it appear in the Makefile (I'm not really good with build systems and such). It does seem to be included when building with cmake, but passing the NVCC flags I need to cmake isn't working for me. Is there a reason it's not built by default, or at least included in the Makefile?
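
On the flag-passing point, a sketch of one way to do it, assuming the CUDA build was toggled by LLAMA_CUBLAS at the time and that the flags in question are ordinary NVCC arguments (CMAKE_CUDA_FLAGS and CMAKE_CUDA_ARCHITECTURES are standard CMake variables, not llama.cpp-specific ones):

```sh
# Sketch: forwarding NVCC flags at configure time instead of via make.
mkdir build && cd build
cmake .. -DLLAMA_CUBLAS=ON \
         -DLLAMA_BUILD_SERVER=ON \
         -DCMAKE_CUDA_ARCHITECTURES=86 \
         -DCMAKE_CUDA_FLAGS="--use_fast_math"
cmake --build . --config Release
```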

@KerfuffleV2
Collaborator

KerfuffleV2 commented May 26, 2023

Looks like you can only build the server via cmake.

From poking around a bit, it's actually quite large compared to all the other examples and includes a copy of an HTTP library and a JSON parsing library. I'd guess it's disabled by default because it's relatively slow to compile.

Just for comparison, the main example (basically what people call llama.cpp) is around 25 KB of source, with about another 33 KB of common code shared with all the other examples. For the server, the JSON parsing and HTTP support come to about 1 MB of source code.

This answer doesn't necessarily help someone that just wants that feature but hopefully at least makes it seem like less of an arbitrary decision.

@jp-aa

jp-aa commented May 26, 2023

> I'd guess it's disabled by default because it's relatively slow to compile.

Well, it's not that bad. I managed to compile the server with make just by adding "server" to the default target, the clean target, and an entry in the examples section of the Makefile that looks the same as most of the other entries (see the sketch below).
My only problem now is offloading model layers to the GPU while running the server binary. I have no idea if that's even a feature yet. If someone knows how, please tell me.
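
A sketch of the kind of Makefile entry described, modeled on the pattern of the existing example targets; the exact target list and prerequisites are assumptions and may need adjusting to match the real Makefile:

```makefile
# Hypothetical additions, patterned on the other example targets.
# (Recipe lines must be indented with a tab.)
default: main quantize perplexity embedding server

server: examples/server/server.cpp ggml.o llama.o common.o $(OBJS)
	$(CXX) $(CXXFLAGS) $^ -o $@ $(LDFLAGS)

clean:
	rm -vf *.o main quantize perplexity embedding server
```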

@SlyEcho
Collaborator

SlyEcho commented May 26, 2023

It just needs to be added to the release workflow, nothing special; something like the excerpt below.
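
A hypothetical excerpt of what that change could look like; the actual file layout and step names under .github/workflows may differ:

```yaml
# Hypothetical build step; real step names and matrix entries may differ.
- name: Build
  run: |
    mkdir build
    cd build
    cmake .. -DLLAMA_BUILD_SERVER=ON
    cmake --build . --config Release
```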
