sampling: support min_p sampling #422

Merged 5 commits on Aug 8, 2024

@xslingcn (Contributor) commented Aug 6, 2024

This PR adds support for min_p sampling through a new `sampling.min_p_sampling_from_probs` API.

  • Implement kernel
  • Add Tests

Ref: Min P Sampling.
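
For readers unfamiliar with the technique, here is a minimal pure-Python sketch of min_p sampling as it is commonly described: tokens whose probability falls below `min_p` times the largest probability are dropped, the rest are renormalized, and a token is drawn from what remains. This is only an illustration of the idea and not the CUDA kernel added in this PR; the helper names `min_p_filter` and `min_p_sample` are hypothetical and do not reflect the signature of `sampling.min_p_sampling_from_probs`.

```python
import random


def min_p_filter(probs, min_p):
    """Drop tokens below min_p * max(probs) and renormalize the rest.

    Assumes probs is a non-empty list of non-negative floats with a positive
    maximum, and 0 <= min_p <= 1 (so the most likely token always survives).
    """
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


def min_p_sample(probs, min_p, rng=random):
    """Draw one token id from the min_p-filtered distribution."""
    filtered = min_p_filter(probs, min_p)
    u = rng.random()  # uniform sample in [0, 1)
    cumulative = 0.0
    last_kept = None
    for token_id, p in enumerate(filtered):
        if p == 0.0:
            continue  # filtered-out tokens can never be selected
        cumulative += p
        last_kept = token_id
        if u < cumulative:
            return token_id
    # Guard against floating-point rounding leaving cumulative slightly below 1.
    return last_kept


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.15, 0.05]
    # With min_p = 0.2 the threshold is 0.2 * 0.5 = 0.1, so the last token is dropped.
    print(min_p_filter(probs, 0.2))
    print(min_p_sample(probs, 0.2, random.Random(0)))
```

Because the threshold scales with the top probability, the filter is strict when the model is confident and permissive when the distribution is flat, which is the usual motivation given for min_p over a fixed cutoff.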

@yzh119 yzh119 merged commit d52f2da into flashinfer-ai:main Aug 8, 2024
@yzh119 yzh119 mentioned this pull request Aug 9, 2024
yzh119 added a commit that referenced this pull request Aug 9, 2024
🤖 I have created a release *beep* *boop*
---
## [0.1.4](v0.1.3...v0.1.4) (2024-08-09)


### Features

* append attention kernels for fp8 kv-cache ([#420](#420)) ([906c2f5](906c2f5))
* support min_p sampling ([#422](#422)) ([d52f2da](d52f2da))
* deterministic sampling ([#417](#417)) ([0dd801d](0dd801d))
* more sampling operator options ([#431](#431)) ([68df9c4](68df9c4))
* support fused add rmsnorm ([#419](#419)) ([b781513](b781513))
* support fused silu mul ([#427](#427)) ([ea0ba9a](ea0ba9a))

### Bug Fixes

* fix dispatch fp16 type when enable fp8 ([#430](#430)) ([daa5566](daa5566))
* improve numerical stability of sampling kernels ([#429](#429)) ([898d8ea](898d8ea))

### Other improvements

* break up `_kernels` into multiple modules ([#428](#428)) ([8e482d9](8e482d9))

### Acknowledgement

We thank the community for their contributions and feedback:
[@comaniac](https://github.com/comaniac),
[@esmeetu](https://github.com/esmeetu),
[@LiuXiaoxuanPKU](https://github.com/LiuXiaoxuanPKU),
[@peng1999](https://github.com/peng1999),
[@xslingcn](https://github.com/xslingcn),
[@Yard1](https://github.com/Yard1),
[@zhyncs](https://github.com/zhyncs).

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Zihao Ye <expye@outlook.com>