Synchronize LLAMA_API with ggml-org/llama.cpp and update cuda workflow for windows #1966

Closed

JamePeng commented Mar 9, 2025

Update llama.cpp from 794fe2 to f08f4b3
Use llama_sampler_init() instead of llama_sampler() for safer usage
Sync llama : add Phi-4-mini support
Sync llama : expose llama_model_n_head_kv in the API
Sync tool-call: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars
class LlamaSampler: add add_xtc(), add_top_n_sigma(), and add_dry()
Remove Tail-Free sampling
Add Top-N-Sigma/XTC/DRY sampler code to the sampler
Sync llama : Add Gemma 3 support
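
For readers unfamiliar with the Top-N-Sigma sampler mentioned above, here is a minimal pure-Python sketch of the general idea as it is commonly described: tokens whose logit falls more than n standard deviations below the maximum logit are masked out before softmax. The function names `top_n_sigma_filter` and `softmax` are hypothetical and are not part of llama-cpp-python or llama.cpp; the real `add_top_n_sigma()` binding in this PR wraps the native sampler, and its exact behavior may differ from this sketch.

```python
import math


def top_n_sigma_filter(logits, n):
    """Mask logits that fall more than n standard deviations below the max.

    Hypothetical helper for illustration only; not the library's API.
    """
    if not logits:
        return []
    mean = sum(logits) / len(logits)
    # Population standard deviation of the raw logits (an assumption of this sketch).
    std = math.sqrt(sum((x - mean) ** 2 for x in logits) / len(logits))
    threshold = max(logits) - n * std
    # Masked tokens get -inf so they receive zero probability after softmax.
    return [x if x >= threshold else float("-inf") for x in logits]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [1.0, 2.0, 3.0, 10.0]
    filtered = top_n_sigma_filter(logits, n=1.0)
    print(filtered)
    print(softmax(filtered))
```

A smaller n keeps fewer candidates; a large n leaves the distribution effectively untouched.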

JamePeng changed the title from "Sync LLAMA_API names with ggml-org/llama.cpp 20250309, support LLAMA_VOCAB_PRE_TYPE_GPT4O" to "Sync LLAMA_API names with ggml-org/llama.cpp 20250309" on Mar 9, 2025

JamePeng commented Mar 9, 2025

I adjusted the workflow to compile pip wheels with VS2022. For your convenience, it generates Windows builds for two CUDA versions (12.4.1 and 12.6.3) and Python 3.10 through 3.12.
They should be compiled by now: https://github.com/JamePeng/llama-cpp-python/releases

JamePeng changed the title from "Sync LLAMA_API names with ggml-org/llama.cpp 20250309" to "Synchronize LLAMA_API with ggml-org/llama.cpp and update cuda workflow for windows" on Mar 9, 2025

JamePeng commented Mar 13, 2025

llama.cpp : refactor llama_context, llama_kv_cache, llm_build_context (ggml-org/llama.cpp#12181)
They changed the API names again :<
