Fix MQA V2 #2388

Merged
merged 2 commits into from
Jan 2, 2025

laclouis5
Contributor

This PR fixes the MultiQueryAttentionV2 module, which has several issues:

  • the q @ k product was not scaled,
  • the output transpose was missing,
  • the output reshape used the input dim instead of the output_dim.
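
To make the three points concrete, here is a minimal sketch of a multi-query attention forward pass with all three fixes applied. It is not the code from this PR: the function name `mqa_forward`, the weight layouts, and the einsum strings are assumptions chosen for illustration, and the scale is assumed to be the usual `key_dim ** -0.5`.

```python
import torch


def mqa_forward(x, q_proj, k_proj, v_proj, out_proj):
    """Hypothetical multi-query attention over a (B, C, H, W) feature map.

    Assumed weight layouts (illustrative, not taken from the PR):
      q_proj:   (num_heads, key_dim, dim)        one query projection per head
      k_proj:   (dim, key_dim)                   keys shared by all heads
      v_proj:   (dim, value_dim)                 values shared by all heads
      out_proj: (num_heads, value_dim, dim_out)
    """
    b, _, h, w = x.shape
    tokens = x.flatten(2).transpose(1, 2)  # (B, N, C) with N = H * W

    q = torch.einsum("bnd,hkd->bnhk", tokens, q_proj)
    k = torch.einsum("bmd,dk->bmk", tokens, k_proj)
    v = torch.einsum("bmd,dv->bmv", tokens, v_proj)

    # Fix 1: scale q @ k before the softmax.
    scale = q.shape[-1] ** -0.5
    attn = torch.einsum("bnhk,bmk->bnhm", q, k) * scale
    attn = attn.softmax(dim=-1)

    o = torch.einsum("bnhm,bmv->bnhv", attn, v)
    out = torch.einsum("bnhv,hvo->bno", o, out_proj)  # (B, N, dim_out)

    # Fix 2: transpose to channels-first before restoring the spatial grid.
    # Fix 3: reshape with dim_out, not the input channel count.
    dim_out = out_proj.shape[-1]
    return out.transpose(1, 2).reshape(b, dim_out, h, w)


# Usage: input and output channel counts differ on purpose.
x = torch.randn(2, 8, 4, 4)
y = mqa_forward(
    x,
    q_proj=torch.randn(4, 6, 8),
    k_proj=torch.randn(8, 6),
    v_proj=torch.randn(8, 5),
    out_proj=torch.randn(4, 5, 16),
)
assert y.shape == (2, 16, 4, 4)
```

Using different input and output channel counts in the usage lines is deliberate: a reshape that reuses the input dim only goes unnoticed when the two happen to be equal.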

@anukaal anukaal left a comment

looks good

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@rwightman rwightman merged commit d23facd into huggingface:main Jan 2, 2025
22 checks passed
4 participants