
llama : fix loading models with shared tok_embd and output #5651

Merged
merged 1 commit into master on Feb 21, 2024

Conversation

slaren
Member

@slaren slaren commented Feb 21, 2024

No description provided.
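Since the PR carries no description, some context on the title: "shared tok_embd and output" refers to models that reuse the token-embedding matrix as the output (LM head) projection, a technique commonly called weight tying. A minimal sketch of the idea in Python follows; the identifiers are illustrative only and are not llama.cpp code.

```python
# Sketch of "tied" embeddings: a single matrix serves both as the
# input token embedding (vocab -> hidden) and, transposed, as the
# output projection (hidden -> vocab logits). Names are hypothetical,
# not llama.cpp identifiers.
VOCAB, HIDDEN = 4, 3
tok_embd = [[0.1 * (r + c) for c in range(HIDDEN)] for r in range(VOCAB)]

def embed(token_id):
    """Input side: look up the embedding row for a token."""
    return tok_embd[token_id]

def output_logits(h):
    """Output side: reuse tok_embd as the LM head (h @ tok_embd^T)."""
    return [sum(hv * ev for hv, ev in zip(h, row)) for row in tok_embd]

h = embed(2)                 # hidden-size vector
logits = output_logits(h)    # one logit per vocabulary entry
print(len(logits))
```

A loader that assumes a separate `output.weight` tensor can fail (or double-count tensors) on such models; the fix here makes loading handle the shared-tensor case.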

Collaborator

@cebtenzzre cebtenzzre left a comment


Seems to work on CPU, CUDA, and Kompute.

@slaren slaren merged commit 973053d into master Feb 21, 2024
46 of 62 checks passed
@slaren slaren deleted the sl/fix-extra-tensor branch February 21, 2024 23:42
cebtenzzre pushed a commit to nomic-ai/llama.cpp that referenced this pull request Feb 22, 2024
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024