unify how to freeze some parameters for coca pre-training #526
Conversation
This pull request was exported from Phabricator. Differential Revision: D54559503

…search#526)

Summary:
1. We already support freezing the vision encoder; as experiments progress, we also want to freeze other parts of CoCa, e.g., the text decoder. This diff provides a unified way to freeze/unfreeze modules, the same way we do for linear probe or finetune.
2. Add a configuration option to use an MLP instead of an attention pooler for the vision adapter.
3. For the output projection in the text decoder, change bias=False to bias=True. Many other places, e.g., the LP head, ember's output module, and LLaVA, use bias=True (the default in Linear).

Differential Revision: D54559503
Privacy Context Container: 303860477774201
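The diff itself is not shown on this page; as a minimal sketch of the unified freeze/unfreeze approach the summary describes, assuming a hypothetical freeze_modules helper and illustrative submodule names (the real config keys and helpers live in the repo):

```python
import torch.nn as nn

def freeze_modules(model: nn.Module, module_names: list[str]) -> None:
    """Freeze the named submodules by disabling gradients.

    module_names are dotted paths such as "vision_encoder" or
    "text_decoder"; every parameter under a named submodule stops
    receiving gradient updates. Names here are illustrative, not the
    actual torchmultimodal config keys.
    """
    for name in module_names:
        submodule = model.get_submodule(name)  # raises if the path is wrong
        for param in submodule.parameters():
            param.requires_grad_(False)
        submodule.eval()  # also freeze batchnorm/dropout behavior

# Driven purely by config, the same call freezes the vision encoder,
# the text decoder, or both:
# freeze_modules(coca_model, ["vision_encoder", "text_decoder"])
```

The same mechanism covers linear probe (freeze everything but the head) and full finetune (freeze nothing), which is the unification the summary refers to.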
Force-pushed from f4a1103 to 3953609
Force-pushed from 3953609 to 44b179b
Codecov Report: All modified and coverable lines are covered by tests ✅

@@           Coverage Diff           @@
##             main     #526    +/- ##
=======================================
  Coverage   75.61%   75.62%
=======================================
  Files         234      234
  Lines       16122    16126     +4
=======================================
+ Hits        12191    12195     +4
  Misses       3931     3931

☔ View full report in Codecov by Sentry.
Force-pushed from 44b179b to da89229
Force-pushed from da89229 to 88933e9
Force-pushed from 88933e9 to abc1037
Force-pushed from abc1037 to dbeed97
Summary:
1. For the output projection in the text decoder, change bias=False to bias=True. Many other places, e.g., the LP head, ember's output module, and LLaVA, use bias=True (the default in Linear).
2. Add a configuration option to use an MLP instead of an attention pooler for the vision adapter.

Differential Revision: D55897450
Privacy Context Container: 303860477774201
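As a minimal sketch of the two remaining changes, with hypothetical factory names and config values (only nn.Linear's bias default is guaranteed; the attention-pooler constructor is elided rather than guessed):

```python
import torch.nn as nn

def build_vision_adapter(adapter_type: str, dim: int, out_dim: int) -> nn.Module:
    # New config switch: "mlp" swaps a simple two-layer MLP in for the
    # existing attention pooler. Names are illustrative, not the actual
    # torchmultimodal config keys.
    if adapter_type == "mlp":
        return nn.Sequential(
            nn.Linear(dim, dim),
            nn.GELU(),
            nn.Linear(dim, out_dim),
        )
    # "attention" would construct the existing attention pooler; its exact
    # signature lives in the repo, so it is not reproduced here.
    raise ValueError(f"unknown adapter_type: {adapter_type}")

def build_output_projection(dim: int, vocab_size: int) -> nn.Linear:
    # Previously bias=False; the diff switches to bias=True, matching the
    # LP head, ember's output module, and LLaVA. bias=True is also
    # nn.Linear's default, so it could simply be omitted.
    return nn.Linear(dim, vocab_size, bias=True)
```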
Summary:
Pull Request resolved: #527
Pull Request resolved: #526

1. For the output projection in the text decoder, change bias=False to bias=True. Many other places, e.g., the LP head, ember's output module, and LLaVA, use bias=True (the default in Linear).
2. Add a configuration option to use an MLP instead of an attention pooler for the vision adapter.

Reviewed By: Bellaktris
Differential Revision: D55897450
Privacy Context Container: 303860477774201
fbshipit-source-id: 8e012b0c3d37566364f216dbfa8aec389142afe1