.dll .so generation #39

Open
vaiju1981 opened this issue Feb 27, 2025 · 1 comment

Comments

@vaiju1981

Hi, I am currently trying to integrate llama.cpp with Java (https://github.com/vaiju1981/java-llama.cpp/tree/b4689), which allows the llama.cpp server to run inside Java.

I was wondering if there is a way to use llama-box instead of llama.cpp (mostly because it has vision support). If that is possible, how would I go about it?

@thxCode
Collaborator

thxCode commented Feb 27, 2025

llama-box is an application built on llama.cpp, so the binder and the application sit at the same level: one provides a language-specific interface, the other an HTTP interface.

You can think of llama-box as playing the role that Jetty or Tomcat plays in the Java world.

I believe the binder can implement the same VL logic; all you need is to pass the right batch to llama_decode:

llama-box/llama-box/server.cpp

Lines 3868 to 3869 in 384ca12

qwen2vl_text_token_batch_wrapper batch_txt = qwen2vl_text_token_batch_wrapper((tokens.data() + j), n_eval, batch_txt_mrope_pos.data(), slot.id);
if (llama_decode(llm_ctx, batch_txt.batch)) {
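As a rough illustration of what "the right batch" might involve for text tokens, here is a minimal sketch of a position buffer like the `batch_txt_mrope_pos` passed above. The layout is an assumption taken from the Qwen2-VL example in llama.cpp (four sections of `n_tokens` entries, the first three holding the same sequential position and the fourth left at zero); this thread does not confirm it. `pos_t` and `make_text_mrope_positions` are hypothetical names, not part of llama.cpp or llama-box.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for llama.cpp's llama_pos type.
using pos_t = std::int32_t;

// Build a position buffer for n_tokens text tokens starting at start_pos.
// ASSUMED layout (not confirmed by this thread): 4 contiguous sections of
// n_tokens entries each. Sections 0..2 all carry start_pos + i for token i;
// section 3 stays zero.
std::vector<pos_t> make_text_mrope_positions(pos_t start_pos, int n_tokens) {
    const std::size_t n = static_cast<std::size_t>(n_tokens);
    std::vector<pos_t> pos(n * 4, 0);
    for (std::size_t section = 0; section < 3; ++section) {
        for (std::size_t i = 0; i < n; ++i) {
            pos[section * n + i] = start_pos + static_cast<pos_t>(i);
        }
    }
    return pos;
}
```

A binder would presumably build a buffer like this for each chunk of text tokens, attach it to the batch it hands to llama_decode, and advance the starting position by the number of tokens evaluated. How the buffer is attached depends on the wrapper used, which is not shown in this thread.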


2 participants