Add qk norm optionally before attention calculation #8820
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8820
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 Cancelled Jobs as of commit 5b2587a with merge base 73acde9. The following jobs were cancelled; please retry.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D70355802
@pytorchbot label "topic: not user facing"
Summary: Some of the new Llama checkpoints developed by GenAI use an additional qk_norm in the attention calculation. To run these models with ExecuTorch and keep parity with the server models, this PR adds an optional qk norm to the ET attention. RMSNorm is also refactored into a separate file so that there is no circular dependency between attention and llama_transformer.

Reviewed By: iseeyuan
Differential Revision: D70355802
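For context, here is a minimal sketch of what an optional qk norm in an attention block can look like. It is not the exact ExecuTorch code; the names `use_qk_norm`, `q_norm`, and `k_norm` are illustrative assumptions, and the RMSNorm shown stands in for the module refactored out of llama_transformer.

```python
# Sketch only: optional RMSNorm applied to per-head query/key vectors
# before the attention score computation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale by the reciprocal root mean square of the last dimension.
        rms = torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
        return x * rms * self.weight


class Attention(nn.Module):
    def __init__(self, dim: int, n_heads: int, use_qk_norm: bool = False):
        super().__init__()
        self.n_heads = n_heads
        self.head_dim = dim // n_heads
        self.wq = nn.Linear(dim, dim, bias=False)
        self.wk = nn.Linear(dim, dim, bias=False)
        self.wv = nn.Linear(dim, dim, bias=False)
        self.wo = nn.Linear(dim, dim, bias=False)
        # Optional norms on q and k, applied per head before attention.
        self.use_qk_norm = use_qk_norm
        if use_qk_norm:
            self.q_norm = RMSNorm(self.head_dim)
            self.k_norm = RMSNorm(self.head_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        bsz, seqlen, _ = x.shape
        q = self.wq(x).view(bsz, seqlen, self.n_heads, self.head_dim)
        k = self.wk(x).view(bsz, seqlen, self.n_heads, self.head_dim)
        v = self.wv(x).view(bsz, seqlen, self.n_heads, self.head_dim)
        if self.use_qk_norm:
            # Normalize query/key head vectors before computing scores.
            q = self.q_norm(q)
            k = self.k_norm(k)
        q, k, v = (t.transpose(1, 2) for t in (q, k, v))
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        out = out.transpose(1, 2).reshape(bsz, seqlen, -1)
        return self.wo(out)
```

With `use_qk_norm=False` the block reduces to the standard attention path, so existing checkpoints are unaffected; with it enabled, checkpoints trained with qk norm on the server side should see matching activations on the ET side.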
LGTM. Thank you @madhu-fb !