
Result of merging 2 Gemma2 9B models gains 1B parameters somehow #385

Closed
jim-plus opened this issue Jul 28, 2024 · 6 comments · Fixed by #406

Comments

@jim-plus

Resulting model weights and SLERP merge formula here:
https://huggingface.co/grimjim/Gemma2-Nephilim-v3-9B

An exl2 quant of the above works, but where did the extra 1B parameters come from?

@ALucek

ALucek commented Aug 8, 2024

https://huggingface.co/AdamLucek/gemma2-2b-it-chinese-german

Also found this happening with a model_stock merge of Gemma2 2B.

@jim-plus
Author

jim-plus commented Aug 8, 2024

In the case of the 9B model, the fault appears to reside in the first safetensors shard. There's a spurious lm_head.weight tensor that should be removed both from that shard and from model.safetensors.index.json; after that, the model size is what it should be.
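
For reference, a minimal sketch of the index.json side of that fix (not from the thread; the path is illustrative, adjust for your merge output):

import json

# Illustrative path to the merged model's index file; adjust as needed.
index_path = "path/to/your/model.safetensors.index.json"

with open(index_path) as f:
    index = json.load(f)

# Drop the spurious entry from the weight map, if present.
removed = index["weight_map"].pop("lm_head.weight", None) is not None

# Note: metadata["total_size"] will still include the duplicate tensor's bytes;
# decrease it by vocab_size * hidden_size * bytes-per-element if you want the
# reported size to match exactly.

with open(index_path, "w") as f:
    json.dump(index, f, indent=2)

print("Removed lm_head.weight from weight_map" if removed
      else "lm_head.weight not found in weight_map")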

@ALucek

ALucek commented Aug 8, 2024

Beat me to it; the same thing is happening here with lm_head.weight for the 2B model.

Looks like it's likely something related to handling the tokenizer source.

@h-lunah

h-lunah commented Aug 9, 2024

And how can the duplicate lm_head.weight be removed, so I can merge uncensored models for maximum uncensorship?

@ALucek

ALucek commented Aug 9, 2024

@piotr25691 Remove the entry for it from your index.json using any text editor, and then for the model itself you can edit the shard directly with the safetensors package. Here's a simplified script that will do it for you:

from safetensors import safe_open
from safetensors.torch import save_file

# Path to your SafeTensors file
input_file = "path/to/your/model-00001-of-00002.safetensors"
output_file = "path/to/your/fixed-model-00001-of-00002.safetensors"

# Load the SafeTensors file
tensors = {}
with safe_open(input_file, framework="pt", device="cpu") as f:
    for key in f.keys():
        if key != "lm_head.weight":
            tensors[key] = f.get_tensor(key)

# Save the modified tensors (keep the "pt" format tag so transformers can load the shard)
save_file(tensors, output_file, metadata={"format": "pt"})

print(f"SafeTensors file without lm_head saved to {output_file}")

# Optionally, verify the removal
with safe_open(output_file, framework="pt", device="cpu") as f:
    if "lm_head.weight" not in f.keys():
        print("lm_head.weight successfully removed")
    else:
        print("Warning: lm_head.weight still present")

@jukofyork
Contributor

It's because the (transpose of?) lm_head is used as embedding weights too:

ggml-org/llama.cpp#9065

IIRC, the command-r models also reuse the lm_head like this.
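
A quick way to see the tying jukofyork describes (illustrative, assumes access to the gated google/gemma-2-9b-it repo): Gemma2 configs set tie_word_embeddings=True, so transformers reuses the input embedding matrix as the output head, and no separate lm_head.weight should appear in the checkpoint.

from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("google/gemma-2-9b-it")
print(config.tie_word_embeddings)  # True: output head shares the embedding weights

model = AutoModelForCausalLM.from_pretrained("google/gemma-2-9b-it")
# With tying, the head and the embedding are the same Parameter object,
# which is why a serialized lm_head.weight is redundant (and inflates size).
print(model.lm_head.weight is model.model.embed_tokens.weight)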
