Extend FSDP1 global clipping support for optimizers other than Shampoo #2931

Open · wants to merge 1 commit into base: main

Conversation

@wz337 wz337 (Contributor) commented on May 1, 2025

Summary:
This diff is a follow-up to D73474285 and lets other dense optimizers take the `enable_global_grad_clip` optim config. When `enable_global_grad_clip=True` and FSDP1 is used, the global gradient norm is computed at the cost of extra communication (sketched below).

Next steps:

  1. Make global clipping more generic so it also works with FSDP2.
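
For illustration only, here is a minimal sketch of what global gradient clipping under FSDP1 sharding involves; it is not the code in this diff, and the helper names (`global_grad_norm`, `clip_grads_to_global_norm`) and the `sharded_params` argument are hypothetical. Each rank sums the squared norms of its local gradient shards, and an all-reduce (the extra communication mentioned above) combines them into the global norm.

```python
# Minimal sketch, not the implementation in this diff. Assumes an FSDP1-style
# setup where each rank only holds a shard of the gradients, so the global
# norm requires one all-reduce across the process group.
import torch
import torch.distributed as dist


def global_grad_norm(sharded_params, process_group=None):
    """Return the global L2 norm of gradients that are sharded across ranks."""
    device = next(p.device for p in sharded_params)
    local_sq_sum = torch.zeros(1, device=device)
    for p in sharded_params:
        if p.grad is not None:
            # Sum of squared norms over the local shards only.
            local_sq_sum += p.grad.norm(2) ** 2
    # The extra communication: combine the per-rank partial sums.
    dist.all_reduce(local_sq_sum, op=dist.ReduceOp.SUM, group=process_group)
    return local_sq_sum.sqrt()


def clip_grads_to_global_norm(sharded_params, max_norm, process_group=None):
    """Scale local gradient shards so the global norm does not exceed max_norm."""
    total_norm = global_grad_norm(sharded_params, process_group)
    clip_coef = (max_norm / (total_norm + 1e-6)).clamp(max=1.0)
    for p in sharded_params:
        if p.grad is not None:
            p.grad.mul_(clip_coef)
```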

Differential Revision:
D73969566

Privacy Context Container: L1235913

@facebook-github-bot added the CLA Signed label on May 1, 2025
@facebook-github-bot (Contributor) commented:

This pull request was exported from Phabricator. Differential Revision: D73969566

Labels: CLA Signed, fb-exported