I downloaded 4bit.safetensors, all the .json files, and the tokenizer model from this HuggingFace repo into the same directory: https://huggingface.co/Aeala/GPT4-x-AlpacaDente2-30b/tree/main
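(For reference, a sketch of that download step with huggingface_hub; the repo id comes from the link above, the local path is a placeholder, and the local_dir argument is assumed to be available in the installed version:)

from huggingface_hub import snapshot_download

# Pull the weights, configs, and tokenizer into one directory.
snapshot_download(repo_id="Aeala/GPT4-x-AlpacaDente2-30b",
                  local_dir="/path/to/model")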
First I tried to convert it to ggml in q4_0 quantization with the convert.py script and got this error message:
python3 convert.py /path/to/4bit.safetensors --outtype q4_0 --outfile /path/to/alpacadente.bin
Loading model file /path/to/4bit.safetensors
Loading vocab file /path/to/tokenizer.model
Traceback (most recent call last):
  File "convert.py", line 1165, in <module>
    main()
  File "convert.py", line 1157, in main
    model = convert_to_output_type(model, output_type)
  File "convert.py", line 1007, in convert_to_output_type
    return {name: tensor.astype(output_type.type_for_tensor(name, tensor))
  File "convert.py", line 1007, in <dictcomp>
    return {name: tensor.astype(output_type.type_for_tensor(name, tensor))
  File "convert.py", line 503, in astype
    self.validate_conversion_to(data_type)
  File "convert.py", line 514, in validate_conversion_to
    raise Exception(f"Can't turn an unquantized tensor into a quantized type ({data_type})")
Exception: Can't turn an unquantized tensor into a quantized type (QuantizedDataType(groupsize=32, have_addends=False, have_g_idx=False))
But this is a 4-bit-quantized safetensors file. So why does the script claim it's unquantized and refuse to convert it to quantized ggml?
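One way to see why the script says that is to list what's actually inside the file. A minimal sketch, assuming the safetensors and torch packages are installed and the path is adjusted:

from safetensors import safe_open

# Print every tensor's name, dtype, and shape. A GPTQ checkpoint mixes
# packed integer tensors (qweight, scales, and friends) with plain
# float tensors such as embeddings and norm weights.
with safe_open("/path/to/4bit.safetensors", framework="pt") as f:
    for name in f.keys():
        t = f.get_tensor(name)
        print(name, t.dtype, tuple(t.shape))

If I read the traceback right, convert.py hits one of those plain float tensors and refuses to requantize it to q4_0, treating requantization as the job of the separate quantize tool rather than the converter.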
Next I tried it without specifying output quantization and got a different error about a supposed vocab size mismatch:
python3 convert.py /path/to/4bit.safetensors --outfile /modelspace/alpacadente.bin
Loading model file /path/to/4bit.safetensors
Loading vocab file /path/to/tokenizer.model
Traceback (most recent call last):
  File "convert.py", line 1165, in <module>
    main()
  File "convert.py", line 1160, in main
    OutputFile.write_all(outfile, params, model, vocab)
  File "convert.py", line 958, in write_all
    check_vocab_size(params, vocab)
  File "convert.py", line 912, in check_vocab_size
    raise Exception(msg)
Exception: Vocab size mismatch (model has 32016, but /modelspace/alpastadente/tokenizer.model combined with /modelspace/alpastadente/added_tokens.json has 32005).
No idea how to deal with that. 11 of 32016 tokens missing? I guess this is less likely a problem with llama.cpp's script and more likely a problem with the files in the HF repo, but is there a way to tweak and fix something like this if it's just 11 tokens I have to put in somewhere?
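For what it's worth, the converter's count can be reproduced by hand. A minimal sketch, assuming sentencepiece is installed and the paths point at the downloaded files; the padding workaround at the end is untested here and the <pad_N> token names are hypothetical:

import json
from sentencepiece import SentencePieceProcessor

sp = SentencePieceProcessor()
sp.Load("/path/to/tokenizer.model")
base = sp.vocab_size()                    # SentencePiece base vocab

with open("/path/to/added_tokens.json") as f:
    added = json.load(f)                  # extra tokens from the repo

# convert.py compares this sum (32005 here) against the model's
# embedding row count (32016).
print(base, len(added), base + len(added))

# Untested workaround sometimes suggested for this mismatch: pad
# added_tokens.json with dummy placeholder tokens until the counts match.
for i in range(base + len(added), 32016):
    added[f"<pad_{i}>"] = i
with open("/path/to/added_tokens.json", "w") as f:
    json.dump(added, f, indent=2)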
Expected Behavior
Converting 4-bit safetensors to q4_0 ggml should work. Support for q4_1, ..., q4_3, q5_0, and q5_1 as well would be cool.
Current Behavior
The script throws the errors described above.
Environment and Context
CPU: Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
RAM: 64 GB
OS: Linux Mint 20.x (Ubuntu 20.04 base, judging from the g++ version below), kernel 5.4.0-125-generic
$ python3 --version
Python 3.8.10
$ make --version
GNU Make 4.2.1
$ g++ --version
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
The GPTQ 4-bit quantization that 4bit.safetensors uses is only accidentally compatible, in some ways.
Please use the full PyTorch model instead; it will result in better-quality model files.
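In practice that means a two-step flow, sketched here with placeholder paths and assuming the fp16 PyTorch shards have been downloaded and the quantize tool has been built:

python3 convert.py /path/to/full-model-dir --outtype f16 --outfile /path/to/alpacadente-f16.bin
./quantize /path/to/alpacadente-f16.bin /path/to/alpacadente-q4_0.bin q4_0

This keeps convert.py out of the requantization business: it only translates the fp16 weights to ggml, and the 4- and 5-bit formats (q4_0, q4_1, q5_0, q5_1, ...) are produced by quantize.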