unify how to freeze some parameters for coca pre-training #526

Closed

Conversation

zhangtemplar
Contributor

Summary:

  1. We already support freezing the vision encoder; as experiments progress, we also want to freeze other parts of CoCa, e.g., the text decoder. This diff provides a unified way of freezing/unfreezing modules, the same way we do it for linear probe or finetune (see the sketch after this list).
  2. Add a configuration option to use an MLP instead of the attention pooler for the vision adapter.
  3. For the output projection in the text decoder, change bias=False to bias=True. Many other places, e.g., the LP head, ember's output module, and LLaVA, use bias=True (the default in nn.Linear).
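A minimal sketch of the unified freezing, assuming submodules are addressed by name on the CoCa model; the helper and the example module names below are hypothetical, not the actual TorchMultimodal API:

```python
from typing import Iterable

import torch.nn as nn


def freeze_modules(model: nn.Module, module_names: Iterable[str]) -> None:
    """Disable gradients for the named submodules and put them in eval mode."""
    for name in module_names:
        submodule = model.get_submodule(name)  # raises AttributeError on a bad name
        submodule.requires_grad_(False)
        submodule.eval()  # also freezes batch-norm stats and disables dropout


# Hypothetical config-driven usage: freeze the vision encoder and the text decoder,
# while the rest of CoCa (e.g., the vision adapter) keeps training.
# freeze_modules(coca_model, ["vision_encoder", "text_decoder"])
```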

Differential Revision:
D54559503

Privacy Context Container: 303860477774201

@facebook-github-bot added the CLA Signed label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Mar 13, 2024
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D54559503

zhangtemplar added a commit to zhangtemplar/multimodal that referenced this pull request Mar 14, 2024
…search#526)

@codecov-commenter

codecov-commenter commented Mar 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 75.62%. Comparing base (dbeed97) to head (88933e9).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #526   +/-   ##
=======================================
  Coverage   75.61%   75.62%           
=======================================
  Files         234      234           
  Lines       16122    16126    +4     
=======================================
+ Hits        12191    12195    +4     
  Misses       3931     3931           


zhangtemplar added a commit to zhangtemplar/multimodal that referenced this pull request Mar 20, 2024
…search#526)

zhangtemplar added a commit to zhangtemplar/multimodal that referenced this pull request Mar 21, 2024
…search#526)

zhangtemplar added a commit to zhangtemplar/multimodal that referenced this pull request Mar 29, 2024
…search#526)

zhangtemplar added a commit to zhangtemplar/multimodal that referenced this pull request Apr 8, 2024
Summary:

1. For the output projection in the text decoder, change bias=False to bias=True. Many other places, e.g., the LP head, ember's output module, and LLaVA, use bias=True (the default in nn.Linear).
2. Add a configuration option to use an MLP instead of the attention pooler for the vision adapter.

Differential Revision:
D55897450

Privacy Context Container: 303860477774201
facebook-github-bot pushed a commit that referenced this pull request Apr 25, 2024
Summary:
Pull Request resolved: #527

Pull Request resolved: #526

1. For the output projection in the text decoder, change bias=False to bias=True. Many other places, e.g., the LP head, ember's output module, and LLaVA, use bias=True (the default in nn.Linear).
2. Add a configuration option to use an MLP instead of the attention pooler for the vision adapter (a sketch follows this list).
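A minimal sketch of what the MLP vision adapter and the bias change could look like in plain PyTorch; the class name, constructor arguments, and dimensions below are illustrative assumptions, not the actual TorchMultimodal CoCa builder API:

```python
import torch
import torch.nn as nn


class MLPVisionAdapter(nn.Module):
    """Hypothetical MLP alternative to the attention pooler over pooled image features."""

    def __init__(self, vision_dim: int, output_dim: int) -> None:
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(vision_dim, vision_dim),
            nn.GELU(),
            nn.Linear(vision_dim, output_dim),
        )

    def forward(self, pooled_image_features: torch.Tensor) -> torch.Tensor:
        # pooled_image_features: (batch, vision_dim) -> (batch, output_dim)
        return self.mlp(pooled_image_features)


# Text-decoder output projection: bias is now enabled, matching the nn.Linear
# default used elsewhere (LP head, ember's output module, LLaVA). Dimensions
# here are illustrative only.
output_projection = nn.Linear(768, 49408)  # bias=True is the default
```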

Reviewed By: Bellaktris

Differential Revision:
D55897450

Privacy Context Container: 303860477774201

fbshipit-source-id: 8e012b0c3d37566364f216dbfa8aec389142afe1
Labels
CLA Signed, fb-exported
3 participants