Tutorial: How to convert HuggingFace model to GGUF format [UPDATED] #7927
Replies: 5 comments
- There is a problem……
- If the directory of the model before conversion contains a Hugging Face config.json, can I use it with the model after conversion to GGUF format?
- Also, how can we convert a Mistral model using convert_hf_to_gguf.py?
- Hey guys, I am trying to convert the increased-context Llama model from https://huggingface.co/togethercomputer/LLaMA-2-7B-32K to GGUF. When I ran the above instructions, I got this error: Traceback (most recent call last):
- bro, this script is driving me crazy; it was so easy to convert to gguf a year back: `python convert_hf_to_gguf.py llama-3-1-8b-samanta-spectrum --outfile neural-samanta-spectrum.gguf --outtype f16`
- I wanted to make this tutorial because of the latest changes made these last few days in this PR, which change the way you have to tackle the conversion.
Download the Hugging Face model
Source: https://www.substratus.ai/blog/converting-hf-model-gguf-model/
This part hasn't changed, so you can still use the old method; here is a link for how to do it.
For this example I will be using the Bloom 3b model.
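As a minimal sketch, assuming the checkpoint lives at `bigscience/bloom-3b` on the Hub and that `git-lfs` is installed, the download can look like this:

```bash
# fetch the HF repo (weights come through git-lfs)
git lfs install
git clone https://huggingface.co/bigscience/bloom-3b Bloom-3b
```

Cloning into a folder named `Bloom-3b` matches the paths used in the conversion command later on.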
Convert the model
Here is where things changed quite a bit from the last tutorial.
llama.cpp comes with a script that does the GGUF conversion from either a GGML model or an HF model (Hugging Face model).
First, start by cloning the repository.
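Assuming the upstream repo on GitHub:

```bash
git clone https://github.com/ggerganov/llama.cpp.git
```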
Install the Python libraries.
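One way to do this, assuming `pip` points at the Python you will run the scripts with, is to use the umbrella `requirements.txt` at the repo root (it pulls in the per-script requirement files):

```bash
pip install -r llama.cpp/requirements.txt
```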
Important: if the install works just fine then that's good, but if you face some problems, try changing the `numpy` package version in `requirements-convert-legacy-llama.txt` from `numpy~=1.24.4` to `numpy~=1.26.4`. And if you get another error saying it can't download the `2.1.1` version of `torch`, then change `torch~=2.1.1` to `torch~=2.2.1` in both `requirements-convert-hf-to-gguf-update.txt` and `requirements-convert-hf-to-gguf.txt`. These files can be found in the `requirements` folder.
Now go to the `convert_hf_to_gguf_update.py` file and add your model to the models array, which you will find at around line 64.
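A sketch of what the added entry can look like; the shape of the entries and the `TOKENIZER_TYPE` enum come from the script itself, while the exact neighbours and line number will vary with your checkout:

```python
# in convert_hf_to_gguf_update.py, inside the models array (around line 64)
models = [
    # ... existing entries ...
    {"name": "bloom", "tokt": TOKENIZER_TYPE.BPE, "repo": "https://huggingface.co/bigscience/bloom", },
]
```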
In this same file, make sure that the calls `convert_py_pth.read_text()` and `convert_py_pth.write_text(convert_py)` at around line 217 have the parameter `encoding` set to `utf-8`, as in the sketch below.
Remark: for some people this won't change anything, but others will face problems later on if this is not set.
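A sketch of those two calls (the line numbers may drift between versions):

```python
# around line 217 of convert_hf_to_gguf_update.py
convert_py = convert_py_pth.read_text(encoding="utf-8")
# ... the script rewrites convert_py in between ...
convert_py_pth.write_text(convert_py, encoding="utf-8")
```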
Make sure that you have already executed this command before doing the next step.
Now execute the command shown at the start of the `convert_hf_to_gguf_update.py` file.
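At the time of writing, that command is the following; check the header of your own copy in case it has changed:

```bash
python convert_hf_to_gguf_update.py <huggingface_token>
```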
Remark: replace `<huggingface_token>` with your actual Hugging Face account token; here is how to get one if you still don't have one.
Finally, you can run this command to create your `.gguf` file.
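Assembled from the parts explained just below, the command looks like this:

```bash
python llama.cpp/convert_hf_to_gguf.py Bloom-3b --outfile Bloom-3b.gguf --outtype q8_0
```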
- `llama.cpp/convert_hf_to_gguf.py`: the path to the `convert_hf_to_gguf.py` file, relative to the current directory of the terminal
- `Bloom-3b`: the path to the HF model folder, relative to the current directory of the terminal
- `--outfile Bloom-3b.gguf`: the output file; it needs to have the `.gguf` extension at the end
- `--outtype q8_0`: the quantization method
Go to the output directory to see if the `.gguf` file was created.
IMPORTANT: in case the downloaded model doesn't have the `config.json` file in it, you will probably get an error saying that it can't be found. If the model is a Llama model, you can use the same command above but replace `llama.cpp/convert_hf_to_gguf.py` with `llama.cpp/examples/convert-legacy-llama.py` instead (see the sketch below), and hopefully it should work.
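A sketch of that fallback; `My-Llama-Model` is a hypothetical folder name, so substitute your own Llama-family model directory:

```bash
python llama.cpp/examples/convert-legacy-llama.py My-Llama-Model --outfile My-Llama-Model.gguf --outtype q8_0
```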
If you get an error like `raise BadZipFile(f"Overlapped entries: {zinfo.orig_filename!r} (possible zip bomb)")`, try downgrading from Python 3.12 to Python 3.10.