update deepspeed examples commit #15

Merged
@ShadenSmith merged 2 commits into master from jeffra/fix_cifar_example
Feb 5, 2020

Conversation

jeffra
Collaborator

@jeffra jeffra commented Feb 5, 2020

No description provided.

@jeffra jeffra requested a review from ShadenSmith February 5, 2020 19:07
@ShadenSmith ShadenSmith merged commit bb91a3e into master Feb 5, 2020
@jeffra jeffra deleted the jeffra/fix_cifar_example branch February 8, 2020 06:49
kouml pushed a commit to kouml/DeepSpeed that referenced this pull request Apr 3, 2020
jeffra pushed a commit to jeffra/DeepSpeed that referenced this pull request May 15, 2020
Support for new apex style optimizer.step(), grad_clip bug fix in Zer…
rraminen pushed a commit to rraminen/DeepSpeed that referenced this pull request Apr 28, 2021
* refactor to use deepspeed init only

* bug fixes

* remove flags that deepspeed now controls

* use micro batch for 16GB v100

* clean up run script

* remove warning print

* remove dead summary writer code

* removed train.py and renamed train_batch_size to micro batch

* remove max grad norm flag

* removed dead datasets, finetune mode, switched to using deepspeed's grad accu check

* remove max steps

* update max epoch

* simplify 512 script

* update micro bsz for 16gb v100

* used deepspeed not pt
liamcli pushed a commit to determined-ai/DeepSpeed that referenced this pull request Sep 27, 2021
* update default configs

* fix bug with onebitadam + p.p hanging

* send tensors to cuda

Co-authored-by: sid <sidney.black@aleph-alpha.de>
pengwa pushed a commit to pengwa/DeepSpeed that referenced this pull request Oct 14, 2022
* add direct meg-ds to hf format script (deepspeedai#110)

* add direct meg-ds to hf format script (part2) (deepspeedai#111)

* add direct meg-ds to hf format script

* split into 2 functions

* update the usage doc

* make scripts executable

* add shebang

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
2 participants