
Error: Invalid model file when using converted GPT4ALL model after following provided instructions #655


Closed
gaceladri opened this issue Mar 31, 2023 · 11 comments

Comments

@gaceladri

Hello,

I have followed the instructions provided for using the GPT-4ALL model. I used the convert-gpt4all-to-ggml.py script to convert the gpt4all-lora-quantized.bin model, as instructed. However, I encountered an error related to an invalid model file when running the example.

Here are the steps I followed, as described in the instructions:

  1. Convert the model using the convert-gpt4all-to-ggml.py script:
python3 convert-gpt4all-to-ggml.py models/gpt4all/gpt4all-lora-quantized.bin ./models/tokenizer.model
  2. Run the interactive mode example with the newly generated gpt4all-lora-quantized.bin model:
./main -m ./models/gpt4all/gpt4all-lora-quantized.bin -n 256 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt

However, I encountered the following error:

./models/gpt4all/gpt4all-lora-quantized.bin: invalid model file (bad magic [got 0x67676d66 want 0x67676a74])
you most likely need to regenerate your ggml files
the benefit is you'll get 10-100x faster load times
see https://github.com/ggerganov/llama.cpp/issues/91
use convert-pth-to-ggml.py to regenerate from original pth
use migrate-ggml-2023-03-30-pr613.py if you deleted originals
llama_init_from_file: failed to load model
main: error: failed to load model './models/gpt4all/gpt4all-lora-quantized.bin'

Please let me know how to resolve this issue and correctly convert and use the GPT-4ALL model with the interactive mode example.

Thank you.
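
For context on the "bad magic" numbers above: 0x67676d66 is the ASCII bytes "ggmf" (the older versioned ggml container) and 0x67676a74 is "ggjt" (the newer container that the current main expects). Below is a minimal sketch for checking which container a model file carries, assuming the magic is stored as a little-endian 32-bit integer at the very start of the file; check_magic is a hypothetical helper, not part of llama.cpp.

import struct
import sys

# Hypothetical helper, not part of llama.cpp: report which ggml container
# magic a model file starts with. 0x67676d66 ("ggmf") and 0x67676a74 ("ggjt")
# come from the error message above; 0x67676d6c ("ggml") is the original
# unversioned container. The magic is assumed to be a little-endian uint32.
MAGICS = {
    0x67676d6c: "ggml (unversioned)",
    0x67676d66: "ggmf (versioned, pre-mmap)",
    0x67676a74: "ggjt (what the current main expects)",
}

def check_magic(path):
    with open(path, "rb") as f:
        (magic,) = struct.unpack("<I", f.read(4))
    print("%s: 0x%08x -> %s" % (path, magic, MAGICS.get(magic, "unknown / not a ggml file")))

if __name__ == "__main__":
    check_magic(sys.argv[1])

If it reports ggmf, migrate-ggml-2023-03-30-pr613.py is the script the error message points at for rewriting the file to ggjt, which is what the replies below end up doing.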

@gaceladri gaceladri changed the title Error: Invalid model file when using converted GPT-4ALL model after following provided instructions Error: Invalid model file when using converted GPT4ALL model after following provided instructions Mar 31, 2023
@gaceladri
Author

I could run it with the previous version https://github.com/ggerganov/llama.cpp/tree/master-ed3c680

@DonIsaac

I could run it with the previous version https://github.com/ggerganov/llama.cpp/tree/master-ed3c680

After building from this tag, I'm getting a segfault. What OS are you using?

  • Using macOS 13.2 on an M1 chip
  • commit: ed3c680bcd0e8ce6e574573ba95880b694449878
  • output after running ./main -m g4a/gpt4all-lora-quantized.bin -p "hi there" -n 512:
main: seed = 1680284326
llama_model_load: loading model from 'g4a/gpt4all-lora-quantized.bin' - please wait ...
llama_model_load: n_vocab = 32001
llama_model_load: n_ctx   = 512
llama_model_load: n_embd  = 4096
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot   = 128
llama_model_load: f16     = 2
llama_model_load: n_ff    = 11008
llama_model_load: n_parts = 1
llama_model_load: type    = 1
llama_model_load: ggml ctx size = 4273.35 MB
llama_model_load: mem required  = 6065.35 MB (+ 1026.00 MB per state)
llama_model_load: loading model part 1/1 from 'g4a/gpt4all-lora-quantized.bin'
llama_model_load: [1]    28303 segmentation fault  ./main -m g4a/gpt4all-lora-quantized.bin -p "hi there" -n 512

@rabidcopy
Contributor

use migrate-ggml-2023-03-30-pr613.py

@gaceladri
Author

I solved the issue by running the command:

python migrate-ggml-2023-03-30-pr613.py models/gpt4all/gpt4all-lora-quantized.bin models/gpt4all/gpt4all-lora-converted.bin

after first executing:

python3 convert-gpt4all-to-ggml.py models/gpt4all-lora-quantized.bin ./models/tokenizer.model

and now I'm interacting with GPT4All with:

./main -m ./models/gpt4all/gpt4all-lora-converted.bin -n 256 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt

@scottjmaddox

Would it be worth updating the README section with this information?

@ROBOKiTTY

ROBOKiTTY commented Apr 1, 2023

After running convert-gpt4all-to-ggml.py and migrate-ggml-2023-03-30-pr613.py, main segfaults with a failed ggml assertion.

GGML_ASSERT: H:\llama.cpp\ggml.c:3192: ((uintptr_t) (result->data))%GGML_MEM_ALIGN == 0

Full logs:

H:\llama.cpp\bin>main -m models/gpt4all-lora-quantized-v2.bin -n 248
main: seed = 1680331950
llama_model_load: loading model from 'models/gpt4all-lora-quantized-v2.bin' - please wait ...
llama_model_load: n_vocab = 32001
llama_model_load: n_ctx   = 512
llama_model_load: n_embd  = 4096
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot   = 128
llama_model_load: f16     = 2
llama_model_load: n_ff    = 11008
llama_model_load: n_parts = 1
llama_model_load: type    = 1
llama_model_load: ggml map size = 4017.70 MB
llama_model_load: ggml ctx size =  81.25 KB
llama_model_load: mem required  = 5809.78 MB (+ 1026.00 MB per state)
llama_model_load: loading tensors from 'models/gpt4all-lora-quantized-v2.bin'
llama_model_load: model size =  4017.27 MB / num tensors = 291
llama_init_from_file: kv self size  =  256.00 MB

system_info: n_threads = 4 / 12 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
sampling: temp = 0.800000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.100000
generate: n_ctx = 512, n_batch = 8, n_predict = 248, n_keep = 0


 5GGML_ASSERT: H:\llama.cpp\ggml.c:3192: ((uintptr_t) (result->data))%GGML_MEM_ALIGN == 0

@BoQsc

BoQsc commented Apr 1, 2023

These are all the steps I did:

  1. Downloaded gpt4all-lora-quantized.bin via torrent from https://github.com/nomic-ai/gpt4all#try-it-yourself

  2. python -m pip install torch numpy sentencepiece

  3. Downloaded tokenizer.model from https://huggingface.co/decapoda-research/llama-7b-hf/blob/main/tokenizer.model

python convert-gpt4all-to-ggml.py ./models/gpt4all-7B/gpt4all-lora-quantized.bin ./models/tokenizer.model 

python migrate-ggml-2023-03-30-pr613.py models/gpt4all/gpt4all-lora-quantized.bin models/gpt4all/gpt4all-lora-converted.bin.orig

main -m ./llama.cpp/models/gpt4all/gpt4all-lora-converted.bin.orig -n 256 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt


However, it writes nonsense and does not let me interact in interactive mode. Maybe something is wrong.

@ROBOKiTTY

After running convert-gpt4all-to-ggml.py and migrate-ggml-2023-03-30-pr613.py, main segfaults with a failed ggml assertion.

GGML_ASSERT: H:\llama.cpp\ggml.c:3192: ((uintptr_t) (result->data))%GGML_MEM_ALIGN == 0

I commented out this line in ggml.c and recompiled to see what would happen, and it just worked. That was unexpected, but I won't complain.

@clxyder

clxyder commented Apr 2, 2023

These are all the steps I did:

  1. Downloaded gpt4all-lora-quantized.bin via torrent from https://github.com/nomic-ai/gpt4all#try-it-yourself
  2. python -m pip install torch numpy sentencepiece
  3. Downloaded tokenizer.model from https://huggingface.co/decapoda-research/llama-7b-hf/blob/main/tokenizer.model
python convert-gpt4all-to-ggml.py ./models/gpt4all-7B/gpt4all-lora-quantized.bin ./models/tokenizer.model 

python migrate-ggml-2023-03-30-pr613.py models/gpt4all/gpt4all-lora-quantized.bin models/gpt4all/gpt4all-lora-converted.bin.orig

main -m ./llama.cpp/models/gpt4all/gpt4all-lora-converted.bin.orig -n 256 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt

However, it writes nonsense and does not let me interact in interactive mode. Maybe something is wrong.

Can anyone confirm if decapoda-research/llama-7b-hf's tokenizer.model is adequate to use in this case?
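
One quick way to sanity-check a tokenizer.model, assuming sentencepiece is installed (it is pulled in at step 2 above): load it and compare the piece count against the n_vocab value the loader prints (32001 in the logs above). A rough sketch, not an authoritative answer to whether that particular tokenizer is the right one:

import sys
import sentencepiece as spm

# Rough sanity check: print how many pieces the tokenizer defines so it can be
# compared against the n_vocab reported by llama_model_load. Any large
# mismatch would point at the wrong tokenizer.model.
sp = spm.SentencePieceProcessor()
sp.Load(sys.argv[1])  # e.g. ./models/tokenizer.model
print("tokenizer pieces:", sp.GetPieceSize())

Run as, for example, python check_tokenizer.py ./models/tokenizer.model (the script name is just illustrative).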

@ggerganov
Member

After running convert-gpt4all-to-ggml.py and migrate-ggml-2023-03-30-pr613.py, main segfaults with a failed ggml assertion.
GGML_ASSERT: H:\llama.cpp\ggml.c:3192: ((uintptr_t) (result->data))%GGML_MEM_ALIGN == 0

I commented out this line in ggml.c and recompiled to see what would happen, and it just worked. That was unexpected, but I won't complain.

This is strange. It's expected that it works after commenting out this line, since we don't really need the buffer to be aligned, but I wonder why it is no longer the case. It seems to be related to the mmap change.
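
To visualize what that assertion is checking: GGML_ASSERT fires when a tensor's data pointer is not a multiple of GGML_MEM_ALIGN. With an mmap-backed model the mapping base is page-aligned, so base + offset is aligned exactly when the tensor's offset inside the file is itself a multiple of GGML_MEM_ALIGN. A rough illustration of that arithmetic, not llama.cpp code, with GGML_MEM_ALIGN assumed to be 16 here:

import ctypes
import mmap

# Illustration only, not llama.cpp code; GGML_MEM_ALIGN is assumed to be 16.
GGML_MEM_ALIGN = 16

# An anonymous mapping stands in for the mmap'd model file; mmap() hands back
# a page-aligned base address, just like the real mapping of the .bin file.
mm = mmap.mmap(-1, mmap.PAGESIZE)
base = ctypes.addressof(ctypes.c_char.from_buffer(mm))

# A tensor pointer is base plus the tensor's data offset in the file, so it is
# aligned exactly when that file offset is a multiple of GGML_MEM_ALIGN.
for offset in (0, 16, 20, 100):
    status = "aligned" if (base + offset) % GGML_MEM_ALIGN == 0 else "UNALIGNED -> GGML_ASSERT fires"
    print("offset %4d: %s" % (offset, status))

So a converted file whose tensor data does not land on aligned offsets would trip the check, even though, as noted above, the buffer does not strictly need to be aligned for loading to work.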

@d0rc

d0rc commented Jun 2, 2023

It happened to me when trying to use --prompt-cache on a custom model.
