Latest release crashes on start #903
Can confirm, I get the exact same error. Rolling back to the linked release works.
Same issue, introduced in #709.

```c
static const char *llama_ftype_name(enum llama_ftype ftype) {
    switch (ftype) {
        case LLAMA_FTYPE_ALL_F32:     return "all F32";
        case LLAMA_FTYPE_MOSTLY_F16:  return "mostly F16";
        case LLAMA_FTYPE_MOSTLY_Q4_0: return "mostly Q4_0";
        case LLAMA_FTYPE_MOSTLY_Q4_1: return "mostly Q4_1";
        default: LLAMA_ASSERT(false);
    }
}
```

The ftype for my q4_1 model is 4 when this function is called. This is a GPTQ model converted to q4_1, and interestingly, the convert-gptq-to-ggml.py script does write that ftype value.

Ah, so #801 removed the check for GPTQ models. For the actual fix, I guess another llama_ftype could be added?

Temp fix for anyone waiting:

```diff
 static const char *llama_ftype_name(enum llama_ftype ftype) {
     switch (ftype) {
         case LLAMA_FTYPE_ALL_F32:     return "all F32";
         case LLAMA_FTYPE_MOSTLY_F16:  return "mostly F16";
         case LLAMA_FTYPE_MOSTLY_Q4_0: return "mostly Q4_0";
         case LLAMA_FTYPE_MOSTLY_Q4_1: return "mostly Q4_1";
+        case 4:                       return "mostly Q4_1 and some f16";
         default: LLAMA_ASSERT(false);
     }
 }
```

There is no negative effect from just bypassing this assertion; the f16/ftype hparam isn't used anymore.
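For reference, a rough sketch of what "another llama_ftype" could look like, assuming the enum numbering in llama.h at the time; the `LLAMA_FTYPE_MOSTLY_Q4_1_SOME_F16` name is illustrative, not necessarily what was merged:

```c
// llama.h -- sketch: give ftype 4 a proper enumerator instead of a bare magic number
enum llama_ftype {
    LLAMA_FTYPE_ALL_F32              = 0,
    LLAMA_FTYPE_MOSTLY_F16           = 1,
    LLAMA_FTYPE_MOSTLY_Q4_0          = 2,
    LLAMA_FTYPE_MOSTLY_Q4_1          = 3,
    LLAMA_FTYPE_MOSTLY_Q4_1_SOME_F16 = 4, // value written for GPTQ models converted to q4_1 (per this thread)
};

// llama.cpp -- llama_ftype_name() would then get a matching case instead of the raw "case 4":
//     case LLAMA_FTYPE_MOSTLY_Q4_1_SOME_F16: return "mostly Q4_1, some F16";
```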
Yes, I am having this issue as well with GPTQ models.
If you comment out the `LLAMA_ASSERT(false);` line, the crash goes away as well.
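A minimal sketch of that workaround; the fallback return is my addition so the function still returns a string once the assert is disabled:

```c
static const char *llama_ftype_name(enum llama_ftype ftype) {
    switch (ftype) {
        case LLAMA_FTYPE_ALL_F32:     return "all F32";
        case LLAMA_FTYPE_MOSTLY_F16:  return "mostly F16";
        case LLAMA_FTYPE_MOSTLY_Q4_0: return "mostly Q4_0";
        case LLAMA_FTYPE_MOSTLY_Q4_1: return "mostly Q4_1";
        default:
            // LLAMA_ASSERT(false); // the assert that fires for ftype == 4
            return "unknown";        // fallback so printing the ftype still works
    }
}
```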
For now I just rolled back to the commit before this change.
My apologies, I assumed that the "4" format was no longer supported by the new loader code in #801; that's why I didn't add a value for it in the `llama_ftype` enum.
I'm experiencing this error too. Does anyone know what the issue is? I think this is a bug, since one of the previous releases that doesn't have this problem is master-2663d2c.