Tutorial: How to convert HuggingFace model to GGUF format [UPDATED] #7927
Replies: 5 comments
- There is a problem……
- If the directory of the model before conversion contains a Hugging Face config.json, can I use it with the model after conversion to GGUF format?
- Also, how can we convert a Mistral model using convert_hf_to_gguf.py?
- Hey guys, I am trying to convert the increased-context Llama model from https://huggingface.co/togethercomputer/LLaMA-2-7B-32K to GGUF. When I ran the above instructions, I got this error: Traceback (most recent call last):
- bro, this script is driving me crazy; it was so easy to convert to gguf a year back: `python convert_hf_to_gguf.py llama-3-1-8b-samanta-spectrum --outfile neural-samanta-spectrum.gguf --outtype f16`
- I wanted to make this tutorial because of the latest changes made these last few days in this PR, which change the way you have to tackle the conversion.
Download the Hugging Face model
Source: https://www.substratus.ai/blog/converting-hf-model-gguf-model/
This part hasn't changed, so you can still use the old method; here is a link for how to do it.
For this example I will be using the Bloom 3b model.
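As a minimal sketch, assuming the checkpoint lives at `bigscience/bloom-3b` on the Hub and that `git-lfs` is installed, the download can look like this:

```bash
# fetch the HF repo (weights come through git-lfs)
git lfs install
git clone https://huggingface.co/bigscience/bloom-3b Bloom-3b
```

Cloning into a folder named `Bloom-3b` matches the paths used in the conversion command later on.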
Convert the model
Here is where things changed quite a bit from the last tutorial.
llama.cpp comes with a script that does the GGUF conversion from either a GGML model or an HF model (Hugging Face model).
First, start by cloning the repository.
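Assuming the upstream repo on GitHub:

```bash
git clone https://github.com/ggerganov/llama.cpp.git
```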
Install the Python libraries.
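One way to do this, assuming `pip` points at the Python you will run the scripts with, is to use the umbrella `requirements.txt` at the repo root (it pulls in the per-script requirement files):

```bash
pip install -r llama.cpp/requirements.txt
```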
Important: if the install works just fine then that's good, but if you face some problems, try changing the `numpy` package version in `requirements-convert-legacy-llama.txt` from `numpy~=1.24.4` to `numpy~=1.26.4`. And if you get another error saying it can't download the `2.1.1` version of `torch`, then change `torch~=2.1.1` to `torch~=2.2.1` in both `requirements-convert-hf-to-gguf-update.txt` and `requirements-convert-hf-to-gguf.txt`. These files can be found in the `requirements` folder.
Now go to the `convert_hf_to_gguf_update.py` file and add your model to the models array, which you will find at around line 64.
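A sketch of what the added entry can look like; the shape of the entries and the `TOKENIZER_TYPE` enum come from the script itself, while the exact neighbours and line number will vary with your checkout:

```python
# in convert_hf_to_gguf_update.py, inside the models array (around line 64)
models = [
    # ... existing entries ...
    {"name": "bloom", "tokt": TOKENIZER_TYPE.BPE, "repo": "https://huggingface.co/bigscience/bloom", },
]
```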
In this same file, make sure that the calls `convert_py_pth.read_text()` and `convert_py_pth.write_text(convert_py)` at around line 217 have the parameter `encoding` set to `utf-8`, as in the sketch below.
Remark: for some people this won't change anything, but others will face problems later on if this is not set.
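A sketch of those two calls (the line numbers may drift between versions):

```python
# around line 217 of convert_hf_to_gguf_update.py
convert_py = convert_py_pth.read_text(encoding="utf-8")
# ... the script rewrites convert_py in between ...
convert_py_pth.write_text(convert_py, encoding="utf-8")
```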
Make sure that you have already executed this command before doing the next step.
Now execute the command shown at the start of the `convert_hf_to_gguf_update.py` file.
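At the time of writing, that command is the following; check the header of your own copy in case it has changed:

```bash
python convert_hf_to_gguf_update.py <huggingface_token>
```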
Remark: replace `<huggingface_token>` with your actual Hugging Face account token; here is how to get one if you still don't have one.
Finally, you can run this command to create your `.gguf` file.
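Assembled from the parts explained just below, the command looks like this:

```bash
python llama.cpp/convert_hf_to_gguf.py Bloom-3b --outfile Bloom-3b.gguf --outtype q8_0
```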
- `llama.cpp/convert_hf_to_gguf.py`: the path to the `convert_hf_to_gguf.py` file, relative to the current directory of the terminal
- `Bloom-3b`: the path to the HF model folder, relative to the current directory of the terminal
- `--outfile Bloom-3b.gguf`: the output file; it needs to have the `.gguf` extension at the end
- `--outtype q8_0`: the quantization method
Go to the output directory to see if the `.gguf` file was created.
IMPORTANT: in case the downloaded model doesn't have the `config.json` file in it, you will probably get an error saying that it can't be found. If the model is a Llama model, you can use the same command above but replace `llama.cpp/convert_hf_to_gguf.py` with `llama.cpp/examples/convert-legacy-llama.py` instead (see the sketch below), and hopefully it should work.
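A sketch of that fallback; `My-Llama-Model` is a hypothetical folder name, so substitute your own Llama-family model directory:

```bash
python llama.cpp/examples/convert-legacy-llama.py My-Llama-Model --outfile My-Llama-Model.gguf --outtype q8_0
```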
If you get an error like `raise BadZipFile(f"Overlapped entries: {zinfo.orig_filename!r} (possible zip bomb)")`, try downgrading from Python 3.12 to Python 3.10.