sync with #16860
stas00 committed Apr 20, 2022
1 parent 0b74488 commit da03a9b
Showing 1 changed file with 3 additions and 3 deletions.
docs/source/en/perf_hardware.mdx: 3 additions & 3 deletions
@@ -17,15 +17,15 @@ limitations under the License.

# Custom hardware for training

The hardware you use to run model training and inference can have a big effect on performance. For a deep dive into GPUs, make sure to check out Tim Dettmers' excellent [blog post](https://timdettmers.com/2020/09/07/which-gpu-for-deep-learning/).

Let's have a look at some practical advice for GPU setups.

## GPU

When you train bigger models, you have essentially three options:
- bigger GPUs
- more GPUs
-- more CPU and NVMe (offloaded to by [DeepSpeed-Infinity](deepspeed#nvme-support))
+- more CPU and NVMe (offloaded to by [DeepSpeed-Infinity](main_classes/deepspeed#nvme-support))
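
The NVMe route is the least self-explanatory of the three, so here is a minimal sketch of what it involves, assuming a DeepSpeed ZeRO stage-3 config with NVMe offload; `/local_nvme` is a placeholder path and the `run_clm.py` invocation is illustrative only, not part of this file:

```bash
# Sketch only: offload optimizer states and parameters to a fast local
# NVMe drive with ZeRO stage 3 (DeepSpeed-Infinity).
cat > ds_config.json <<'EOF'
{
  "train_batch_size": "auto",
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "nvme", "nvme_path": "/local_nvme" },
    "offload_param": { "device": "nvme", "nvme_path": "/local_nvme" }
  }
}
EOF

# Any Trainer-based script picks the config up via --deepspeed
# (remaining arguments elided):
deepspeed examples/pytorch/language-modeling/run_clm.py \
    --model_name_or_path gpt2 --deepspeed ds_config.json --do_train ...
```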

Let's start at the case where you have a single GPU.

@@ -147,4 +147,4 @@ rm -r /tmp/test-clm; CUDA_VISIBLE_DEVICES=0,1 NCCL_P2P_DISABLE=1 python -m torch
```

Hardware: 2x TITAN RTX 24GB each + NVLink with 2 NVLinks (`NV2` in `nvidia-smi topo -m`)
Software: `pytorch-1.8-to-be` + `cuda-11.0` / `transformers==4.3.0.dev0`
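
To see whether your own GPUs have a comparable interconnect, the standard `nvidia-smi` queries below are a reasonable starting point (the GPU index is arbitrary); `NV2` above means the two cards are linked by 2 NVLinks, and `NCCL_P2P_DISABLE=1` in the benchmark command is what turns NVLink peer-to-peer off for the comparison run:

```bash
# Pairwise interconnect matrix: NV# means # NVLinks between a pair of GPUs,
# while PHB/NODE/SYS mean traffic crosses PCIe and/or the system bus.
nvidia-smi topo -m

# Per-link NVLink status for GPU 0.
nvidia-smi nvlink --status -i 0
```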
