chore(main): release 0.0.5 #232

Merged
merged 5 commits into from
Jun 20, 2024

Conversation

github-actions[bot]
Contributor

@github-actions github-actions bot commented May 4, 2024

🤖 I have created a release beep boop

0.1.0 (2024-06-20)

Highlights

Acknowledgement

We thank @ibsidorenko, @LiuXiaoxuanPKU, @Yard1, @AgrawalAmey, @xuzhenqi, @mgerstgrasser, @esmeetu, @yz-tang, @HSQ79815, @Qubitium, @shreygupta2809, @sighingnow, @vinx13,
@tqchen, @merrymercy, @comaniac, and many others for their contributions and helpful discussions for the 0.0.5 release.

Refactor

  • support any GQA group size for tensor-cores kernels (#301) (c111ca)
  • support any page size for tensor-cores kernels (#306) (82fd8c)
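
For readers unfamiliar with the two terms above, here is a minimal sketch in plain Python of what "GQA group size" and "page size" usually mean. It is not FlashInfer's code or API: the function names are hypothetical, and the contiguous head grouping is an assumption based on the common convention rather than something stated in this PR.

```python
def gqa_group_size(num_qo_heads: int, num_kv_heads: int) -> int:
    """Number of query heads that share one KV head (hypothetical helper)."""
    if num_qo_heads <= 0 or num_kv_heads <= 0:
        raise ValueError("head counts must be positive")
    if num_qo_heads % num_kv_heads != 0:
        raise ValueError("num_qo_heads must be a multiple of num_kv_heads")
    return num_qo_heads // num_kv_heads


def kv_head_for(qo_head: int, num_qo_heads: int, num_kv_heads: int) -> int:
    """KV head read by a query head, ASSUMING contiguous grouping."""
    if not 0 <= qo_head < num_qo_heads:
        raise IndexError("qo_head out of range")
    return qo_head // gqa_group_size(num_qo_heads, num_kv_heads)


def page_slot(token_pos: int, page_size: int) -> tuple[int, int]:
    """Return (index into a request's page list, offset inside that page)."""
    if page_size <= 0 or token_pos < 0:
        raise ValueError("page_size must be positive and token_pos non-negative")
    return divmod(token_pos, page_size)
```

For example, 28 query heads over 4 KV heads gives a group size of 7 (not a power of two), and token 10 with a page size of 3 lands at offset 1 of the fourth page. Supporting "any" group size or page size means the kernels no longer restrict these values to a fixed set.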

Features

  • add use_tensor_cores option to decode kernels to accelerate GQA (#317) (3b50dd5)
  • add group GEMM operators (#282) (e08ba42)
  • initial support of distributed operators (#289) (03553da)
  • initial support of logits hook (#298) (ab1e2ad)
  • separate Q and KV dtypes for decode (#286) (5602659)
  • support CUDA graph for batched multi-query (prefill/append) attention (#275) (83ceb67)
  • support CUDA graph for batched multi-query (prefill/append) attention (#277) (24cc583)
  • support custom attention mask in prefill/append attention kernels (#266) (7304282)
  • fused speculative sampling kernels (#259) (cea2bb)
  • expose sampling APIs in PyTorch (#238) (092902)

Performance Improvements


This PR was generated with Release Please. See documentation.

@github-actions github-actions bot force-pushed the release-please--branches--main branch from 8516bd5 to f30a8ef Compare May 4, 2024 00:13
@github-actions github-actions bot force-pushed the release-please--branches--main branch 2 times, most recently from e0f50ac to da332d5 Compare May 27, 2024 10:11
@github-actions github-actions bot changed the title chore(main): release 0.0.5 chore(main): release 0.1.0 May 28, 2024
@github-actions github-actions bot force-pushed the release-please--branches--main branch 6 times, most recently from c2a4b09 to 2d495a7 Compare June 4, 2024 05:22
@github-actions github-actions bot force-pushed the release-please--branches--main branch 6 times, most recently from c2f98d5 to ad28cf8 Compare June 11, 2024 05:15
@github-actions github-actions bot force-pushed the release-please--branches--main branch 6 times, most recently from b35df6b to 588ed9d Compare June 20, 2024 06:47
@github-actions github-actions bot force-pushed the release-please--branches--main branch from 588ed9d to d803bed Compare June 20, 2024 08:14
@yzh119 yzh119 changed the title chore(main): release 0.1.0 chore(main): release 0.0.5 Jun 20, 2024
@yzh119 yzh119 merged commit 5c05676 into main Jun 20, 2024
@yzh119 yzh119 deleted the release-please--branches--main branch June 20, 2024 17:15