feat: customize logits_soft_cap value #339

Merged: yzh119 merged 3 commits into main from more-logits-soft-cap on Jun 28, 2024

yzh119 (Collaborator) commented Jun 28, 2024

This PR adds support for custom logits soft cap values. Different models use different soft cap values (e.g., Grok-1 uses 30 and Gemma-2 uses 50).
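For readers unfamiliar with the term, here is a minimal sketch of what a logits soft cap does, assuming the usual tanh formulation `cap * tanh(x / cap)`. The helper name `soft_cap` and the sample scores are hypothetical and purely illustrative; this is not FlashInfer's API or kernel code.

```python
import math


def soft_cap(logits, cap):
    """Squash each logit smoothly into the open interval (-cap, cap).

    Hypothetical helper for illustration; assumes the tanh-based
    formulation cap * tanh(x / cap).
    """
    if cap <= 0:
        raise ValueError("cap must be positive")
    return [cap * math.tanh(x / cap) for x in logits]


# Made-up attention scores, including two extreme values.
scores = [-200.0, -5.0, 0.0, 5.0, 200.0]

capped_30 = soft_cap(scores, 30.0)  # cap value mentioned for Grok-1
capped_50 = soft_cap(scores, 50.0)  # cap value mentioned for Gemma-2

print(capped_30)
print(capped_50)
```

Values that are small relative to the cap pass through almost unchanged, while large values saturate near the cap, which is why the cap has to be configurable per model rather than hard-coded.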

yzh119 merged commit a2498f5 into main on Jun 28, 2024
yzh119 added a commit that referenced this pull request Jun 28, 2024
🤖 I have created a release *beep* *boop*
---


## [0.0.7](v0.0.6...v0.0.7) (2024-06-28)

### Bugfix

* fix the `forward_return_lse` function in `BatchPrefillWithRaggedKVCache` class ([#337](#337))
* fix the scheduler behavior of large page size ([#333](#333))

### Features

* customize `logits_soft_cap` value ([#339](#339)) ([a2498f5](a2498f5))


### Performance Improvements

* change minimal `kv_chunk_size` back to 128 ([#329](#329)) ([f237f5f](f237f5f))
* more options for kv tile size ([#336](#336)) ([bf2a6c7](bf2a6c7))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Zihao Ye <expye@outlook.com>

zhyncs (Member) commented Jun 29, 2024

> Gemma-2 uses 50

Great work!

yzh119 deleted the more-logits-soft-cap branch June 30, 2024 07:14
yzh119 restored the more-logits-soft-cap branch July 3, 2024 03:51
yzh119 deleted the more-logits-soft-cap branch July 3, 2024 07:48