perf: accelerate alibi #365

Merged
merged 4 commits into from
Jul 10, 2024

@yzh119 yzh119 commented Jul 10, 2024

ALiBi experienced a performance degradation after #262 because of an increased number of integer divisions.
This PR fixes the issue.
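
For readers unfamiliar with why integer division matters here, the following is a generic illustration only, not the change made in this PR. It assumes a hypothetical packed row layout (`row = qo_idx * group_size + head`) and the usual ALiBi bias form `slope * (kv_idx - qo_idx)`; the function names and the layout are assumptions made for the sketch. It compares recovering the indices once per element against once per row.

```python
def alibi_slopes(num_heads):
    # Common ALiBi slope schedule for a power-of-two head count.
    return [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]


def bias_divide_per_element(num_rows, kv_len, group_size, slopes):
    # Hypothetical slow path: unpack (qo_idx, head) for every element.
    divisions = 0
    out = []
    for row in range(num_rows):
        row_bias = []
        for kv_idx in range(kv_len):
            qo_idx, head = divmod(row, group_size)  # repeated for each kv_idx
            divisions += 1
            row_bias.append(slopes[head] * (kv_idx - qo_idx))
        out.append(row_bias)
    return out, divisions


def bias_divide_per_row(num_rows, kv_len, group_size, slopes):
    # Hoisted variant: unpack (qo_idx, head) once per row and reuse it.
    divisions = 0
    out = []
    for row in range(num_rows):
        qo_idx, head = divmod(row, group_size)
        divisions += 1
        slope = slopes[head]
        out.append([slope * (kv_idx - qo_idx) for kv_idx in range(kv_len)])
    return out, divisions
```

Both functions produce the same bias values; the hoisted variant performs one division per row instead of one per element, which is the kind of saving that matters in an inner attention loop.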

@yzh119 yzh119 merged commit 4f0a9f9 into main Jul 10, 2024
yzh119 added a commit that referenced this pull request Jul 12, 2024
🤖 I have created a release *beep* *boop*
---


## [0.0.9](v0.0.8...v0.0.9) (2024-07-12)

### Bugfix

* fix the decode kernel segfault in cudagraph mode
([#368](https://github.com/flashinfer-ai/flashinfer/pull/368))([c69cfa](https://github.com/flashinfer-ai/flashinfer/commit/c69cfabc540e4a7edd991713df10d575ff3b0c21))
* fix decode kernels output for empty kv cache
([#363](https://github.com/flashinfer-ai/flashinfer/pull/363))([ac72b1](https://github.com/flashinfer-ai/flashinfer/commit/ac72b1cc14a6474d601f371c8d69e2600ac28d2f))
* check gpu id in PyTorch APIs and use input tensor's gpu default stream
([#361](https://github.com/flashinfer-ai/flashinfer/pull/361))([1b84fa](https://github.com/flashinfer-ai/flashinfer/commit/1b84fab3e4f53fb4fa26952fdb46fa8018634057))

### Performance Improvements

* accelerate alibi
([#365](#365))
([4f0a9f9](4f0a9f9))
* accelerate gqa performance
([#356](#356))
([e56ddad](e56ddad))
* Optimize tensor conversions in C++ code to avoid unnecessary copies
([#366](#366))
([1116237](1116237))

### Acknowledgement

We thank [@Yard1](https://github.com/Yard1),
[@Ying1123](https://github.com/Ying1123) and
[@zhyncs](https://github.com/zhyncs) for their contributions.

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Zihao Ye <expye@outlook.com>
@yzh119 yzh119 deleted the alibi-acceleration branch July 24, 2024 10:38