Skip to content

Setting temp=0 does not work as expected #684

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
4 tasks done
abetlen opened this issue Apr 1, 2023 · 3 comments · Fixed by #720
Closed
4 tasks done

Setting temp=0 does not work as expected #684

abetlen opened this issue Apr 1, 2023 · 3 comments · Fixed by #720

Comments

@abetlen
Copy link
Collaborator

abetlen commented Apr 1, 2023

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

Setting sampling temperature to 0 should produce valid and "predictable" tokens.

Current Behavior

Setting temperature to 0 causes sampling to fail completely. This is due to plogits being scaled by 1.0f/temp before sampling here. I believe a workaround for this would be to make sampling deterministic when temp==0 by setting top_p=0.0 and top_k=1 and setting temp>0.

@Fabio3rs
Copy link
Contributor

Fabio3rs commented Apr 1, 2023

Hi, I did a PR trying to fix this, can you test it?
#689

Thanks

@ivanstepanovftw

This comment was marked as resolved.

@ivanstepanovftw
Copy link
Collaborator

ivanstepanovftw commented Apr 2, 2023

Why not just handle the case explicitly as https://github.com/facebookresearch/llama/compare/main...shawwn:llama:main does?

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants