Fix bug in main.cpp (penalize_nl=false doesn't work). Suppress warning on mingw. #1528
Conversation
* Fix bug in main.cpp where penalize_nl=false has no effect. It modifies the underlying logits array, but at this point we are already working on the candidates copy.
* Suppress redefinition warning for NOMINMAX on mingw. In my installation, this macro is already defined by /usr/lib/gcc/x86_64-w64-mingw32/11/include/c++/x86_64-w64-mingw32/bits/os_defines.h:45.
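A minimal sketch of the kind of guard that avoids the redefinition warning; the surrounding `_WIN32` block and the `<windows.h>` include are assumptions about where the define lives in main.cpp, not a quote of the actual change:

```cpp
// Only define NOMINMAX if the toolchain (e.g. mingw's os_defines.h) has not
// already defined it, so the "macro redefined" warning is not triggered.
#if defined(_WIN32)
#ifndef NOMINMAX
#define NOMINMAX
#endif
#include <windows.h>
#endif
```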
So many little mingw oddities cropping up. I'm curious though, any idea what's including …?
examples/main/main.cpp (Outdated)

                candidates_p.data[idx].logit = nl_logit;
                break;
            }
        }
Hmm, the order of candidates_p does not seem to be changed. Why do you think it has?
It's not that the order of candidates_p has changed, although IMO it's not great to rely on the fact that llama_sample_repetition_penalty and llama_sample_frequency_and_presence_penalties preserve the order (without at least documenting that). The main issue is that on line 418 we've copied the logits into the local candidates vector (of llama_token_data_array, not raw floats). So modifying the original logit array from the context does nothing, right?
@tom7 Apologies for the delay and for not paying more attention to this - I think you are right, and we've had this bug for quite some time now.
I do not see a bug here. The order of the logits is not changed after penalization.
Thanks for the quick look!
Apparently it's via …
It looks like this PR didn't gain much traction, but I think I just discovered the same bug that this PR fixes. As I understand it, the current situation is that the nl logit is saved and then restored into the context's raw logits array after the penalties are applied. However, as far as I can see that restore has no effect, because sampling continues from the candidates copy. This PR changes it to find the nl token in candidates_p and restore the logit there instead.
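Roughly, the changed block looks like this (a reconstruction built around the hunk quoted above; the exact loop bounds and the llama_token_nl(ctx) signature are inferred from the rest of this thread and may differ from the final code):

```cpp
// Restore the newline logit in the candidates copy, not in the context's
// raw logits array, so that penalize_nl=false actually takes effect.
if (!penalize_nl) {
    for (size_t idx = 0; idx < candidates_p.size; idx++) {
        if (candidates_p.data[idx].id == llama_token_nl(ctx)) {
            candidates_p.data[idx].logit = nl_logit;
            break;
        }
    }
}
```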
@martindevans Thanks for bringing attention to this again.
@tom7 Sorry for the delay of ~3 months 😄
* master: (773 commits)
  server : add `/detokenize` endpoint (ggml-org#2802)
  convert.py : advanced option (ggml-org#2753)
  llama : use Unicode Escape Sequence to replace encoded characters (ggml-org#2814)
  flake.nix : add rocm support and cleanup (ggml-org#2808)
  llama : move #includes out of _GNU_SOURCE conditional (ggml-org#2817)
  main : fix bug (penalize_nl=false doesn't work) + suppress warning on mingw (ggml-org#1528)
  llama : use std::abs in llama_sample_tail_free (ggml-org#2800)
  k-quants : remove unnecessary tensor shape restrictions (ggml-org#2811)
  Better perplexity for 2- and 3-bit quantization for LLaMA-v2-70B (ggml-org#2807)
  Fix HellaSwag (ggml-org#2805)
  flake : build llama.cpp on Intel with nix (ggml-org#2795)
  Handle null rope scaling value (ggml-org#2793)
  Fix spm whitespaces (ggml-org#2806)
  examples : skip unnecessary external lib in server README.md how-to (ggml-org#2804)
  llama : fix struct decl (ggml-org#2790)
  Faster perplexity computation (ggml-org#2786)
  llama : add llama_beam_search() (ggml-org#2267)
  convert.py : Get rope scale from HuggingFace models (ggml-org#2772)
  llama-bench : add model sizes (ggml-org#2771)
  convert.py : export rope freq_base when converting CodeLlama from an HF model (ggml-org#2773)
  ...
main : fix bug (penalize_nl=false doesn't work) + suppress warning on mingw (ggml-org#1528)

* Fix bug in main.cpp where penalize_nl=false has no effect. It modifies the underlying logits array, but at this point we are already working on the candidates copy.
* Suppress redefinition warning for NOMINMAX on mingw. In my installation, this macro is already defined by /usr/lib/gcc/x86_64-w64-mingw32/11/include/c++/x86_64-w64-mingw32/bits/os_defines.h:45.
* main : fix indentation
* main : pass ctx to llama_token_nl()

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Pretty sure this is just a bug, but it's always possible I'm missing something!