

llama : expose model's rope_freq_scale in the API #3418

Merged: 1 commit into ggml-org:master on Oct 3, 2023

Conversation

grencez (Contributor) commented on Sep 30, 2023

I think this is necessary for automatic implementations of https://github.com/ggerganov/llama.cpp/tree/master/examples/main#extended-context-size when the model's RoPE scaling factor isn't 1.0. (We want to further scale it rather than overwriting the value, right?)

Commit message: llama : expose model's rope_freq_scale in the API, so it can be scaled further before creating a context.
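A minimal sketch of how a downstream program might compound the scaling, assuming the accessor added here is `llama_rope_freq_scale_train` and building on the existing `rope_freq_scale` field of `llama_context_params`; the model path and the 0.5f factor are illustrative only:

```c
// Sketch: read the model's trained RoPE frequency scale, then scale it
// further (rather than overwriting it) before creating a context.
#include <stdio.h>

#include "llama.h"

int main(void) {
    llama_backend_init(false);

    struct llama_model_params mparams = llama_model_default_params();
    struct llama_model * model =
        llama_load_model_from_file("model.gguf", mparams); // illustrative path
    if (model == NULL) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    // Scaling factor baked in at training/conversion time; 1.0f for most
    // models, but e.g. 0.25f for models fine-tuned with linear RoPE scaling.
    const float freq_scale_train = llama_rope_freq_scale_train(model);

    struct llama_context_params cparams = llama_context_default_params();
    cparams.n_ctx = 8192;
    // Further halve the frequency scale to roughly double the usable
    // context, compounding with whatever scale the model already uses.
    cparams.rope_freq_scale = freq_scale_train * 0.5f;

    struct llama_context * ctx = llama_new_context_with_model(model, cparams);
    if (ctx == NULL) {
        fprintf(stderr, "failed to create context\n");
        llama_free_model(model);
        return 1;
    }

    // ... run inference ...

    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```

Without the accessor, a caller setting `rope_freq_scale` would silently clobber any scaling the model was trained with; exposing the trained value lets the two factors be multiplied instead.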
ggerganov merged commit 48be797 into ggml-org:master on Oct 3, 2023
grencez deleted the model_rope branch on Oct 3, 2023
grencez added a commit to rendezqueue/rendezllama that referenced this pull request Oct 4, 2023
joelkuiper added a commit to vortext/llama.cpp that referenced this pull request Oct 5, 2023
…example

* 'master' of github.com:ggerganov/llama.cpp: (24 commits)
  convert : fix Baichuan2 models by using vocab size in config.json (ggml-org#3299)
  readme : add project status link
  ggml : fix build after ggml-org#3329
  llm : add Refact model (ggml-org#3329)
  sync : ggml (conv 1d + 2d updates, UB fixes) (ggml-org#3468)
  finetune : readme fix typo (ggml-org#3465)
  ggml : add RISC-V Vector Support for K-Quants and improved the existing intrinsics (ggml-org#3453)
  main : consistent prefix/suffix coloring (ggml-org#3425)
  llama : fix session saving/loading (ggml-org#3400)
  llama : expose model's rope_freq_scale in the API (ggml-org#3418)
  metal : alibi for arbitrary number of heads (ggml-org#3426)
  cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor (ggml-org#3273)
  Work on the BPE tokenizer (ggml-org#3252)
  convert : fix vocab size when not defined in hparams (ggml-org#3421)
  cmake : increase minimum version for add_link_options (ggml-org#3444)
  CLBlast: Add broadcast support for matrix multiplication (ggml-org#3402)
  gguf : add BERT, MPT, and GPT-J arch info (ggml-org#3408)
  gguf : general usability improvements (ggml-org#3409)
  cmake : make CUDA flags more similar to the Makefile (ggml-org#3420)
  finetune : fix ggml-org#3404 (ggml-org#3437)
  ...
yusiwen pushed a commit to yusiwen/llama.cpp that referenced this pull request Oct 7, 2023
grencez added a commit to rendezqueue/rendezllama that referenced this pull request Oct 12, 2023