Support Gemma model shape #130

Closed
yzh119 opened this issue Feb 21, 2024 · 1 comment
Comments


yzh119 commented Feb 21, 2024

Gemma uses head_dim=256, which is not enabled in the pip wheels by default. We should compile kernels for head_dim=256 and tune some kernel parameters for best performance in this case.
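For context, here is a minimal NumPy sketch of what single-query (decode) attention at Gemma's head_dim=256 computes. This is only a reference implementation to illustrate the shape involved; the function name and tensor layout are assumptions for illustration, not FlashInfer's actual API (the real kernels fuse and tile this computation on the GPU).

```python
import numpy as np

def single_query_attention(q, k, v):
    """Reference single-query attention.

    q: (head_dim,) query for one head at the current decode step.
    k, v: (seq_len, head_dim) cached keys/values for that head.
    Returns: (head_dim,) attention output.
    """
    scale = 1.0 / np.sqrt(q.shape[-1])
    logits = (k @ q) * scale          # (seq_len,) attention scores
    logits -= logits.max()            # numerically stable softmax
    probs = np.exp(logits)
    probs /= probs.sum()
    return probs @ v                  # weighted sum of values

# Gemma's head dimension; most prior models use 64 or 128.
head_dim = 256
seq_len = 8
rng = np.random.default_rng(0)
q = rng.standard_normal(head_dim)
k = rng.standard_normal((seq_len, head_dim))
v = rng.standard_normal((seq_len, head_dim))

out = single_query_attention(q, k, v)
assert out.shape == (head_dim,)
```

The doubled head dimension matters for the kernels because per-head register and shared-memory usage scales with head_dim, which is why head_dim=256 needs its own compiled template and tuned launch parameters rather than reusing the head_dim=128 configuration.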

yzh119 added a commit that referenced this issue Feb 25, 2024
As mentioned in #130, the kernels for `head_dim=256` are not compiled
by default. This PR exposes these attention kernels in the pip wheels and
adds unit tests and benchmarks for `head_dim=256`.

yzh119 commented Feb 25, 2024

Fixed in #132.
