Does this support safetensors files? #1586

Closed
guranu opened this issue May 24, 2023 · 3 comments

Comments

guranu commented May 24, 2023

Hello, I want to install WizardLM 7B Uncensored, but the model is a safetensors file, so I was wondering: does this support .safetensors files?

@guranu guranu changed the title Does this safetensors files? Does this support safetensors files? May 24, 2023
EliEron commented May 24, 2023

No, not directly. Safetensors files are usually provided for GPTQ models, which are designed to run on GPTQ-for-LLaMa or AutoGPTQ, not on llama.cpp.

For llama.cpp you want GGML files; practically every model offered in GPTQ form is also offered in GGML somewhere else. For WizardLM-7B Uncensored you can find GGML files here. You only need one of the files. The lower the number after the Q in the file name, the less accurate the model is, but the fewer resources it needs. So Q4_0 is the easiest to run and Q8_0 the hardest.

CRD716 (Contributor) commented May 24, 2023

Yes, it does. You need to run the convert.py script on the files to turn them into a valid GGML model.
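For reference, a minimal sketch of that workflow with the llama.cpp tooling as it existed around this time (May 2023). The model directory name is an assumption; substitute the folder containing your downloaded safetensors checkpoint, tokenizer, and config files:

```shell
# From the root of the llama.cpp repository.
# Convert the safetensors/HF checkpoint to a GGML file (f16 by default).
# "models/wizardlm-7b-uncensored" is a hypothetical path to your download.
python3 convert.py models/wizardlm-7b-uncensored

# Optionally quantize the resulting f16 file down to Q4_0 to save memory.
./quantize models/wizardlm-7b-uncensored/ggml-model-f16.bin \
           models/wizardlm-7b-uncensored/ggml-model-q4_0.bin q4_0
```

Note that convert.py reads the tokenizer and config alongside the weights, so the whole model directory is needed, not just the .safetensors file.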

guranu (Author) commented May 25, 2023

> No, not directly. Safetensors files are usually provided for GPTQ models, which are designed to run on GPTQ-for-LLaMa or AutoGPTQ, not on llama.cpp.
>
> For llama.cpp you want GGML files; practically every model offered in GPTQ form is also offered in GGML somewhere else. For WizardLM-7B Uncensored you can find GGML files here. You only need one of the files. The lower the number after the Q in the file name, the less accurate the model is, but the fewer resources it needs. So Q4_0 is the easiest to run and Q8_0 the hardest.

I see, thanks for the help.

3 participants