hotfix: fix the decode kernel with logits cap #350

Merged
yzh119 merged 1 commit into main from bugfix-decode-logits-cap on Jul 3, 2024

Conversation

yzh119 (Collaborator) commented Jul 3, 2024

The logits soft cap should be applied before masking, not after it.

Thanks @LiuXiaoxuanPKU for spotting this bug.
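
To illustrate why the order matters, here is a minimal pure-Python sketch, not the actual CUDA kernel. It assumes the common tanh form of soft capping, `cap * tanh(x / cap)`, and uses hypothetical helper names (`soft_cap`, `attend_cap_then_mask`, `attend_mask_then_cap`) that do not come from this repository. If masking runs first, a masked logit of `-inf` is mapped by the cap to the finite value `-cap`, so the masked position receives nonzero attention weight.

```python
import math


def soft_cap(x, cap):
    # Assumed soft-cap form: squashes x smoothly into (-cap, cap).
    return cap * math.tanh(x / cap)


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend_cap_then_mask(logits, mask, cap):
    # Intended order: cap first, then set masked positions to -inf.
    capped = [soft_cap(x, cap) for x in logits]
    masked = [c if keep else -math.inf for c, keep in zip(capped, mask)]
    return softmax(masked)


def attend_mask_then_cap(logits, mask, cap):
    # Buggy order: tanh(-inf) == -1, so masked logits become -cap (finite).
    masked = [x if keep else -math.inf for x, keep in zip(logits, mask)]
    capped = [soft_cap(x, cap) for x in masked]
    return softmax(capped)


logits = [2.0, 0.5, 3.0]
mask = [True, True, False]  # the last position must be ignored

good = attend_cap_then_mask(logits, mask, cap=1.0)
bad = attend_mask_then_cap(logits, mask, cap=1.0)
print("cap then mask:", good)  # last weight is exactly 0.0
print("mask then cap:", bad)   # last weight is strictly positive
```

The leak is largest for small cap values, where `-cap` sits close to the capped values of the unmasked logits.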

@yzh119 yzh119 merged commit f5f7a2a into main Jul 3, 2024
yzh119 added a commit that referenced this pull request Jul 3, 2024
Follow-up of #350:
add the case of `logits_soft_cap=1.0` to the correctness tests.
add batch decode/prefill tests.
@yzh119 yzh119 deleted the bugfix-decode-logits-cap branch July 3, 2024 07:48
@yzh119 yzh119 mentioned this pull request Jul 3, 2024
yzh119 added a commit that referenced this pull request Jul 3, 2024
## [0.0.8](v0.0.7...v0.0.8) (2024-07-03)

### Bugfix

* fix prefill/append kernel behavior for empty kv-cache ([#353](#353)) ([7adc8c](7adc8cf))
* fix decode attention kernel with logits cap ([#350](#350)) ([f5f7a2](f5f7a2a))