
feat: Add mask to merge_state_in_place #372

Merged
1 commit merged into flashinfer-ai:main on Jul 13, 2024

Conversation

@Yard1 Yard1 (Contributor) commented Jul 13, 2024

This pushes the conditional logic down into the kernel, allowing for better CUDA graph support with variable sequence lengths. I didn't see much purpose in adding the mask parameter to the out-of-place merge state kernels. A usage sketch is shown below.
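A minimal sketch of how the masked in-place merge might be called, assuming the Python binding `flashinfer.merge_state_in_place(v, s, v_other, s_other, mask=...)` after this change; the tensor shapes, dtypes, and the `valid_len` variable below are illustrative assumptions, not taken from this PR's diff.

```python
# Illustrative sketch: merging two partial attention states in place, with a
# boolean mask selecting which rows actually get merged.
# Shapes/dtypes are assumptions for illustration.
import torch
import flashinfer

seq_len, num_heads, head_dim = 2048, 32, 128
device = torch.device("cuda")

# Attention output `v` and per-head log-sum-exp `s` from one cascade level.
v = torch.randn(seq_len, num_heads, head_dim, dtype=torch.half, device=device)
s = torch.randn(seq_len, num_heads, dtype=torch.float32, device=device)

# Partial state from another level, to be merged into (v, s).
v_other = torch.randn_like(v)
s_other = torch.randn_like(s)

# Only the first `valid_len` rows are real tokens; the rest is padding kept so
# tensor shapes stay fixed across CUDA graph replays.
valid_len = 1500
mask = torch.arange(seq_len, device=device) < valid_len

# Rows where mask is False are left untouched by the kernel, so the same
# captured graph can be replayed for different sequence lengths by only
# rewriting `mask` in place.
flashinfer.merge_state_in_place(v, s, v_other, s_other, mask=mask)
```

Because the skip decision happens inside the kernel rather than on the host, the launch parameters are identical regardless of `valid_len`, which is what makes the call safe to capture in a CUDA graph.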

@yzh119 yzh119 (Collaborator) left a comment

LGTM, the mask looks like a good feature to have, thank you @Yard1!

@yzh119 yzh119 merged commit e14fa81 into flashinfer-ai:main Jul 13, 2024
@Yard1 Yard1 deleted the cascade_with_mask branch July 13, 2024 02:53
yzh119 pushed a commit that referenced this pull request Jul 17, 2024
🤖 I have created a release *beep* *boop*
---


## [0.1.0](v0.0.9...v0.1.0) (2024-07-17)


### Features

* Add mask to `merge_state_in_place` ([#372](#372)) ([e14fa81](e14fa81))
* expose pytorch api for block sparse attention ([#375](#375)) ([4bba6fa](4bba6fa))
* Fused GPU sampling kernel for joint top-k & top-p sampling ([#374](#374)) ([6e028eb](6e028eb))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>