
[GELU] Add f32/x4, f16/x2/x8/x8pack kernel. #66

Merged: 5 commits into DefTruth:main on Oct 11, 2024
Conversation

@bear-zd (Contributor) commented on Oct 10, 2024

I saw the mention of GELU in the issue, so I worked on it. Torch has no GELU implementation for half precision (the reason is explained in readme.md), so I implemented some corresponding approximation algorithms.
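The PR itself doesn't inline the kernel code here, but the approximation it alludes to is, in the common case, the tanh-based approximation of GELU, which avoids `erf` and is friendlier to half-precision hardware. A minimal Python sketch for reference (function names are mine, not from the PR; the constants `sqrt(2/pi)` and `0.044715` are the standard ones for this approximation):

```python
import math

def gelu_exact(x: float) -> float:
    # Exact GELU via the Gaussian CDF: GELU(x) = x * Phi(x).
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: float) -> float:
    # Tanh approximation, commonly used in kernels where erf is
    # unavailable or slow (e.g. half-precision paths):
    # 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    c = math.sqrt(2.0 / math.pi)
    return 0.5 * x * (1.0 + math.tanh(c * (x + 0.044715 * x ** 3)))

if __name__ == "__main__":
    # The two agree to well within fp16 precision over typical inputs.
    for x in (-3.0, -1.0, 0.0, 0.5, 2.0):
        print(f"x={x:+.1f}  exact={gelu_exact(x):+.6f}  tanh={gelu_tanh(x):+.6f}")
```

A CUDA kernel would evaluate the same expression with `tanhf`/`__half2` intrinsics; the f32x4 and f16x2/x8 variants in the PR title refer to vectorized loads and stores around this scalar formula.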

@DefTruth (Owner) commented:

LGTM

@DefTruth (Owner) left a comment


Thanks for the contribution! I ran a format pass over the code, re-ran the performance benchmarks on my own machine, and updated the results.

@DefTruth DefTruth merged commit 1eae888 into DefTruth:main Oct 11, 2024
@DefTruth DefTruth mentioned this pull request Oct 11, 2024
2 participants