Update sdpa function with enable_gqa=True #191

Open: 2 commits to be merged into base: main

jainapurva

For the Llama model, set enable_gqa=True in the sdpa function call to use the built-in grouped query attention (GQA) functionality.
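To illustrate what a grouped query attention path computes, here is a minimal pure-Python sketch (standard library only, no PyTorch). It assumes the usual GQA convention that query head h reads key/value head h // group_size, which matches repeating each KV head consecutively before ordinary multi-head attention. All function names (`attend`, `gqa_attention`, `repeat_kv`, `mha_attention`) are hypothetical helpers for this illustration, not code from this PR or from PyTorch.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(q, k, v):
    """Single-head scaled dot-product attention on nested lists.

    q: [Lq][D], k: [Lk][D], v: [Lk][D] -> output [Lq][D].
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for q_row in q:
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        weights = softmax(scores)
        out.append([
            sum(w * v_row[d] for w, v_row in zip(weights, v))
            for d in range(len(v[0]))
        ])
    return out


def gqa_attention(q_heads, k_heads, v_heads):
    """Grouped attention: query head h shares KV head h // group_size."""
    n_q, n_kv = len(q_heads), len(k_heads)
    assert n_q % n_kv == 0, "query heads must be a multiple of KV heads"
    group_size = n_q // n_kv
    return [
        attend(q_heads[h], k_heads[h // group_size], v_heads[h // group_size])
        for h in range(n_q)
    ]


def repeat_kv(heads, n_rep):
    """Repeat each KV head n_rep times consecutively: [k0, k0, k1, k1, ...]."""
    return [head for head in heads for _ in range(n_rep)]


def mha_attention(q_heads, k_heads, v_heads):
    """Ordinary multi-head attention: one KV head per query head."""
    return [attend(q, k, v) for q, k, v in zip(q_heads, k_heads, v_heads)]


# Toy shapes (illustrative values): 4 query heads, 2 KV heads, seq len 2, dim 2.
q_heads = [[[0.1 * (h + 1), 0.2], [0.3, 0.1 * (h + 2)]] for h in range(4)]
k_heads = [[[0.5, -0.1 * (h + 1)], [0.2, 0.4]] for h in range(2)]
v_heads = [[[1.0 + h, 0.0], [0.0, 2.0 + h]] for h in range(2)]

grouped = gqa_attention(q_heads, k_heads, v_heads)
expanded = mha_attention(q_heads, repeat_kv(k_heads, 2), repeat_kv(v_heads, 2))
```

Both paths give the same result; the grouped path simply avoids materializing the repeated key/value heads, which is the usual motivation for a built-in GQA option.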

@facebook-github-bot added the CLA Signed label Jul 13, 2024
@jainapurva jainapurva requested a review from drisspg July 13, 2024 03:56
@yanboliang
Contributor

yanboliang commented Jul 26, 2024

I think we should wait a bit before merging this, since a lot of users are still on older versions of PT that don't support enable_gqa. That said, I'm interested in how much perf gain we get from enabling the built-in GQA. Do you have numbers on A100?
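One way to address the compatibility concern is to pass the flag only when the installed version is new enough. The sketch below is a hedged illustration, not part of this PR: `sdpa_extra_kwargs` and `parse_major_minor` are hypothetical helpers, and the minimum version of (2, 5) is an assumption that should be checked against the PyTorch release notes.

```python
import re

# Assumption: enable_gqa is available from this PyTorch version onward.
ASSUMED_MIN_VERSION = (2, 5)


def parse_major_minor(version):
    """Extract (major, minor) from strings like '2.3.1+cu121' or '2.6.0.dev20240713'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def sdpa_extra_kwargs(torch_version, min_version=ASSUMED_MIN_VERSION):
    """Return {'enable_gqa': True} on new enough versions, else an empty dict.

    Callers on older versions would keep expanding the KV heads manually.
    """
    if parse_major_minor(torch_version) >= min_version:
        return {"enable_gqa": True}
    return {}
```

In a model this could be used as `extra = sdpa_extra_kwargs(torch.__version__)` and then splatted into the sdpa call with `**extra`, keeping the manual KV-head expansion as the fallback path.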


3 participants