split single_gpu and multi_gpu #17083

Merged
merged 2 commits into from
May 9, 2022

ydshieh
Collaborator

@ydshieh ydshieh commented May 4, 2022

What does this PR do?

Fix the scheduled CI failure caused by the limit of 256 jobs that can be generated from a matrix.

Note that the graph on the workflow run page shows no single-gpu or multi-gpu labels. The job names on the left side, however, do mention the matrix.

Screenshot 2022-05-04 145601
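
The arithmetic behind the limit can be sketched as follows. This is only an illustration: it assumes the original job used a two-dimensional matrix (test folders × the two machine types) and that the 256-job cap applies to each matrix separately. The folder count is a made-up number, not the real one.

```python
MATRIX_JOB_LIMIT = 256  # the limit referred to in the PR description


def matrix_job_count(*dimension_sizes):
    """Number of jobs a matrix expands to: the product of its dimension sizes."""
    count = 1
    for size in dimension_sizes:
        count *= size
    return count


# Hypothetical figure, chosen only to illustrate the overflow.
n_folders = 130

# Assumed layout before the split: one job with folders x [single-gpu, multi-gpu].
combined = matrix_job_count(n_folders, 2)

# Assumed layout after the split: two jobs, each with a folders-only matrix.
per_split_job = matrix_job_count(n_folders)

print(combined, combined > MATRIX_JOB_LIMIT)            # combined matrix exceeds the cap
print(per_split_job, per_split_job > MATRIX_JOB_LIMIT)  # each split matrix stays under it
```

Under these assumptions, splitting halves the size of each matrix, so the folder list can keep growing until it alone reaches the cap.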

@ydshieh ydshieh requested a review from LysandreJik May 4, 2022 12:45
@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented May 4, 2022

The documentation is not available anymore as the PR was closed or merged.

Member

@LysandreJik LysandreJik left a comment

Thanks for your PR! Did you launch it as a trial to see if it works? I see the following should be completed, on line 303:

[setup, run_tests_gpu, run_examples_gpu, run_pipelines_tf_gpu, run_pipelines_torch_gpu, run_all_tests_torch_cuda_extensions_gpu]

@ydshieh
Collaborator Author

ydshieh commented May 5, 2022

> Thanks for your PR! Did you launch it as a trial to see if it works? I see the following should be completed, on line 303:
>
> [setup, run_tests_gpu, run_examples_gpu, run_pipelines_tf_gpu, run_pipelines_torch_gpu, run_all_tests_torch_cuda_extensions_gpu]

You are right, that line should be changed. I haven't launched it (just tried with a dummy example). I will launch it now.

@LysandreJik
Member

You can launch it with only 1-2 models in each run, for example by updating this line:

          echo "::set-output name=matrix::$(python3 -c 'import os; tests = os.getcwd(); model_tests = os.listdir(os.path.join(tests, "models")); d1 = sorted(list(filter(os.path.isdir, os.listdir(tests)))); d2 = sorted(list(filter(os.path.isdir, [f"models/{x}" for x in model_tests]))); d1.remove("models"); d = d2 + d1; print(d)')"

to

          echo "::set-output name=matrix::$(python3 -c 'import os; tests = os.getcwd(); model_tests = os.listdir(os.path.join(tests, "models"))[:2]; d1 = sorted(list(filter(os.path.isdir, os.listdir(tests)))); d2 = sorted(list(filter(os.path.isdir, [f"models/{x}" for x in model_tests]))); d1.remove("models"); d = d2 + d1; print(d)')"

This way you'll test the full behavior without 12-hour-long iterations.
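
For readability, the logic of the one-liner can be unpacked into a small script. This is a sketch, not the code in the workflow file: `build_matrix` and `max_models` are hypothetical names, and unlike the snippet above it filters and sorts the model folders before truncating, so the selected subset is deterministic.

```python
import os
import tempfile


def build_matrix(tests_dir, max_models=None):
    """Return the list of test folders used as the job matrix (hypothetical helper)."""
    models_dir = os.path.join(tests_dir, "models")

    # Model test folders such as "models/bert"; plain files are skipped.
    model_dirs = sorted(
        f"models/{name}"
        for name in os.listdir(models_dir)
        if os.path.isdir(os.path.join(models_dir, name))
    )
    # Optionally keep only the first few models for a quick trial run.
    if max_models is not None:
        model_dirs = model_dirs[:max_models]

    # Remaining top-level test folders, excluding "models" itself.
    top_dirs = sorted(
        name
        for name in os.listdir(tests_dir)
        if os.path.isdir(os.path.join(tests_dir, name)) and name != "models"
    )
    return model_dirs + top_dirs


if __name__ == "__main__":
    # Demo on a throwaway directory tree with made-up folder names.
    with tempfile.TemporaryDirectory() as root:
        for folder in ["models/bert", "models/gpt2", "models/albert", "utils"]:
            os.makedirs(os.path.join(root, folder))
        print(build_matrix(root, max_models=2))
```

Passing `max_models=2` plays the role of the `[:2]` slice in the suggestion, while leaving it as `None` reproduces the full matrix.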

@ydshieh
Collaborator Author

ydshieh commented May 5, 2022

It took some time, but the run looks good.

https://github.com/huggingface/transformers/actions/runs/2276209307

@ydshieh ydshieh force-pushed the fix_scheduled_ci_256_limit branch from a9c25d1 to 493b384 Compare May 6, 2022 07:13
@LysandreJik
Member

Looks good, thanks @ydshieh!

@LysandreJik LysandreJik merged commit 3212afa into main May 9, 2022
@LysandreJik LysandreJik deleted the fix_scheduled_ci_256_limit branch May 9, 2022 11:13
nandwalritik pushed a commit to nandwalritik/transformers that referenced this pull request May 10, 2022
* split single_gpu and multi_gpu

* update needs in send_result

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>