

llama : expose model's rope_freq_scale in the API #3418

Merged: 1 commit into ggml-org:master on Oct 3, 2023

Conversation

grencez (Contributor) commented on Sep 30, 2023

I think this is necessary for automatic implementations of https://github.com/ggerganov/llama.cpp/tree/master/examples/main#extended-context-size when the model's RoPE scaling factor isn't 1.0. (We want to further scale it rather than overwriting the value, right?)

Commit message: llama : expose model's rope_freq_scale in the API, so it can be scaled further before creating a context.
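A minimal sketch of how a downstream program might compound the scaling, assuming the accessor added here is `llama_rope_freq_scale_train` and building on the existing `rope_freq_scale` field of `llama_context_params`; the model path and the 0.5f factor are illustrative only:

```c
// Sketch: read the model's trained RoPE frequency scale, then scale it
// further (rather than overwriting it) before creating a context.
#include <stdio.h>

#include "llama.h"

int main(void) {
    llama_backend_init(false);

    struct llama_model_params mparams = llama_model_default_params();
    struct llama_model * model =
        llama_load_model_from_file("model.gguf", mparams); // illustrative path
    if (model == NULL) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    // Scaling factor baked in at training/conversion time; 1.0f for most
    // models, but e.g. 0.25f for models fine-tuned with linear RoPE scaling.
    const float freq_scale_train = llama_rope_freq_scale_train(model);

    struct llama_context_params cparams = llama_context_default_params();
    cparams.n_ctx = 8192;
    // Further halve the frequency scale to roughly double the usable
    // context, compounding with whatever scale the model already uses.
    cparams.rope_freq_scale = freq_scale_train * 0.5f;

    struct llama_context * ctx = llama_new_context_with_model(model, cparams);
    if (ctx == NULL) {
        fprintf(stderr, "failed to create context\n");
        llama_free_model(model);
        return 1;
    }

    // ... run inference ...

    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```

Without the accessor, a caller setting `rope_freq_scale` would silently clobber any scaling the model was trained with; exposing the trained value lets the two factors be multiplied instead.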
ggerganov merged commit 48be797 into ggml-org:master on Oct 3, 2023
grencez deleted the model_rope branch on Oct 3, 2023
grencez added a commit to rendezqueue/rendezllama that referenced this pull request Oct 4, 2023
joelkuiper added a commit to vortext/llama.cpp that referenced this pull request Oct 5, 2023
…example

* 'master' of github.com:ggerganov/llama.cpp: (24 commits)
  convert : fix Baichuan2 models by using vocab size in config.json (ggml-org#3299)
  readme : add project status link
  ggml : fix build after ggml-org#3329
  llm : add Refact model (ggml-org#3329)
  sync : ggml (conv 1d + 2d updates, UB fixes) (ggml-org#3468)
  finetune : readme fix typo (ggml-org#3465)
  ggml : add RISC-V Vector Support for K-Quants and improved the existing intrinsics (ggml-org#3453)
  main : consistent prefix/suffix coloring (ggml-org#3425)
  llama : fix session saving/loading (ggml-org#3400)
  llama : expose model's rope_freq_scale in the API (ggml-org#3418)
  metal : alibi for arbitrary number of heads (ggml-org#3426)
  cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor (ggml-org#3273)
  Work on the BPE tokenizer (ggml-org#3252)
  convert : fix vocab size when not defined in hparams (ggml-org#3421)
  cmake : increase minimum version for add_link_options (ggml-org#3444)
  CLBlast: Add broadcast support for matrix multiplication (ggml-org#3402)
  gguf : add BERT, MPT, and GPT-J arch info (ggml-org#3408)
  gguf : general usability improvements (ggml-org#3409)
  cmake : make CUDA flags more similar to the Makefile (ggml-org#3420)
  finetune : fix ggml-org#3404 (ggml-org#3437)
  ...
yusiwen pushed a commit to yusiwen/llama.cpp that referenced this pull request Oct 7, 2023
grencez added a commit to rendezqueue/rendezllama that referenced this pull request Oct 12, 2023