Using explicit GPU upcast for ZeRO-Offload #6962

Merged 1 commit on Jan 21, 2025

xylian86 (Contributor) commented Jan 20, 2025

Following the discussion in PR-6670, an explicit upcast is much more efficient than an implicit one. This PR replaces the implicit upcast with an explicit one.
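To illustrate the distinction, here is a minimal PyTorch sketch, not the actual DeepSpeed code. It assumes the offload path copies a half-precision gradient into an fp32 CPU buffer. The function names `offload_grad_implicit` and `offload_grad_explicit` and the buffer layout are hypothetical. In the implicit form, `copy_` performs the dtype conversion as part of the cross-device copy. In the explicit form, the tensor is cast to fp32 on its source device first, so the copy only moves data.

```python
import torch

def offload_grad_implicit(grad: torch.Tensor, cpu_buffer: torch.Tensor) -> None:
    # Implicit upcast: copy_ converts fp16 -> fp32 while copying across devices.
    cpu_buffer.copy_(grad.view(-1))

def offload_grad_explicit(grad: torch.Tensor, cpu_buffer: torch.Tensor) -> None:
    # Explicit upcast: cast to fp32 on the source device (the GPU, if present),
    # then copy a tensor whose dtype already matches the destination buffer.
    cpu_buffer.copy_(grad.view(-1).float())

# Hypothetical demo data; falls back to CPU when no GPU is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
grad = torch.randn(4, 256, device=device).half()

buf_implicit = torch.empty(grad.numel(), dtype=torch.float32)
buf_explicit = torch.empty(grad.numel(), dtype=torch.float32)

offload_grad_implicit(grad, buf_implicit)
offload_grad_explicit(grad, buf_explicit)

# Both paths yield the same fp32 values; only where the conversion runs differs.
print(torch.equal(buf_implicit, buf_explicit))
```

The two functions are numerically equivalent, since fp16 to fp32 conversion is exact. Any performance difference comes from where the conversion is executed, and should be measured on the target hardware.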

The results on a 3B model are shown below:

| Option | BWD (ms) | Speedup |
|------------|-----|------|
| Before PR-6670 | 25603.30 | 1x |
| After PR-6670 | 1174.31 | 21.8x |
| After this PR | 309.2 | 82.8x |
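The speedup column can be re-derived from the reported backward times. The short sketch below assumes each speedup is the "Before PR-6670" time divided by the time for that option. It also computes the incremental gain of this PR over PR-6670, which the table does not list.

```python
# Backward-pass times in milliseconds, as reported in the table.
bwd_ms = {
    "Before PR-6670": 25603.30,
    "After PR-6670": 1174.31,
    "After this PR": 309.2,
}

baseline = bwd_ms["Before PR-6670"]

# Speedup relative to the baseline: baseline time / option time.
speedups = {option: baseline / ms for option, ms in bwd_ms.items()}

# Incremental gain of this PR over the state after PR-6670.
incremental = bwd_ms["After PR-6670"] / bwd_ms["After this PR"]

for option, speedup in speedups.items():
    print(f"{option}: {speedup:.1f}x")
print(f"This PR vs. PR-6670: {incremental:.1f}x")
```

Under that assumption, the explicit upcast alone accounts for roughly a 3.8x reduction in backward time on top of PR-6670.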

@loadams loadams enabled auto-merge January 21, 2025 18:16
@loadams loadams added this pull request to the merge queue Jan 21, 2025
Merged via the queue into deepspeedai:master with commit c17dc33 Jan 21, 2025
13 checks passed
tjruwase pushed a commit that referenced this pull request Feb 6, 2025
Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com>
siqi654321 pushed a commit to siqi654321/DeepSpeed that referenced this pull request Feb 7, 2025
Signed-off-by: siqi <siqi@tecorigin.com>