v0.1.4
0.1.4 (2024-08-09)
Features
- append attention kernels for fp8 kv-cache (#420) (906c2f5)
- support min_p sampling (#422) (d52f2da)
- deterministic sampling (#417) (0dd801d)
- more sampling operator options (#431) (68df9c4)
- support fused add rmsnorm (#419) (b781513)
- support fused silu mul (#427) (ea0ba9a)
- support fused gelu tanh mul (#434) (2c9d1c3)
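For readers unfamiliar with the operators named above, the following is a minimal pure-Python sketch of the conventional math behind min_p sampling, fused add rmsnorm, and the fused activation-and-multiply ops. It is not the library's API or kernel code: every function name here is hypothetical, the `eps` default is an assumption, and the convention that the first half of the input is the gated half is also an assumption.

```python
import math


def min_p_filter(probs, min_p):
    """Zero out tokens with probability below min_p * max(probs), then renormalize."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


def add_rmsnorm(x, residual, weight, eps=1e-6):
    """Add x into the residual, then RMS-normalize the sum and scale by weight.

    Returns (normalized, updated_residual) for a single hidden vector.
    """
    updated = [a + b for a, b in zip(x, residual)]
    rms = math.sqrt(sum(v * v for v in updated) / len(updated) + eps)
    normalized = [v / rms * w for v, w in zip(updated, weight)]
    return normalized, updated


def silu(v):
    """SiLU activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def gelu_tanh(v):
    """Tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (v + 0.044715 * v ** 3)
    return 0.5 * v * (1.0 + math.tanh(inner))


def act_and_mul(x, act):
    """Split x in half; apply act to the first half and multiply by the second.

    Assumption: the first half is the gated half.
    """
    d = len(x) // 2
    return [act(g) * u for g, u in zip(x[:d], x[d:])]
```

A fused kernel computes each of these in a single pass instead of launching one kernel per step, which avoids writing intermediate results back to memory. For example, `act_and_mul(x, silu)` corresponds to "silu mul" and `act_and_mul(x, gelu_tanh)` to "gelu tanh mul" under the assumptions stated above.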
Bug Fixes
- fix fp16 type dispatch when fp8 is enabled (#430) (daa5566)
- improve numerical stability of sampling kernels (#429) (898d8ea)
Other improvements
Acknowledgement
We thank the community for their contributions and feedback: @comaniac, @esmeetu, @LiuXiaoxuanPKU, @peng1999, @xslingcn, @Yard1, @zhyncs.