SGDW optimizer #2053

Merged
merged 3 commits into from
Dec 11, 2023

rwightman
Collaborator

No description provided.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@rwightman rwightman merged commit 711c5de into main Dec 11, 2023
24 checks passed
@hiyyg

hiyyg commented Dec 12, 2023

What is the paper reference for SGDW?

@rwightman
Collaborator Author

@hiyyg just the old decoupled decay paper that covered AdamW https://arxiv.org/abs/1711.05101

Been running some weight decay experiments on the side, and it seems SGDW can be worthwhile...
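
For readers unfamiliar with the decoupled decay idea from that paper, here is a minimal sketch of how it differs from classic L2 regularization in SGD with momentum. It works on plain Python floats and is not the code merged in this PR; the function names `sgd_l2_step`, `sgdw_step`, and `run`, and all hyperparameter values, are assumptions chosen for illustration. The two rules only diverge when momentum is non-zero, because L2 decay is folded into the gradient and therefore accumulates in the momentum buffer, while decoupled decay shrinks the weight directly.

```python
def sgd_l2_step(p, g, buf, lr, momentum, weight_decay):
    """Classic SGD + momentum with L2 regularization (illustrative only).

    The decay term is added to the gradient, so it also flows into
    the momentum buffer.
    """
    g = g + weight_decay * p
    buf = momentum * buf + g
    return p - lr * buf, buf


def sgdw_step(p, g, buf, lr, momentum, weight_decay):
    """SGD + momentum with decoupled weight decay (illustrative only).

    The weight is shrunk directly by lr * weight_decay; the momentum
    buffer only ever sees the raw gradient.
    """
    p = p * (1.0 - lr * weight_decay)
    buf = momentum * buf + g
    return p - lr * buf, buf


def run(step_fn, steps, p=1.0, lr=0.1, momentum=0.9, weight_decay=0.1):
    """Apply `step_fn` for `steps` iterations with a zero gradient,
    which isolates the effect of the decay term."""
    buf = 0.0
    for _ in range(steps):
        p, buf = step_fn(p, 0.0, buf, lr, momentum, weight_decay)
    return p


if __name__ == "__main__":
    print("L2 after 2 steps:  ", run(sgd_l2_step, 2))
    print("SGDW after 2 steps:", run(sgdw_step, 2))
```

With a zero gradient, the decoupled variant decays the weight by a constant factor of `1 - lr * weight_decay` per step, while the L2 variant decays faster because earlier decay terms keep contributing through the momentum buffer.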

@rwightman rwightman deleted the sgdw branch July 18, 2024 23:46
3 participants