Using 4 GPUs for training takes the same time as using just 1 #202

Open
MiguelCosta94 opened this issue Dec 5, 2023 · 1 comment

MiguelCosta94 commented Dec 5, 2023

I'm training a BigGAN with differentiable augmentation (DiffAug) and LeCam regularization on a custom dataset. My setup has four NVIDIA RTX 3070 GPUs and runs Ubuntu 20.04. I observe that training on the 4 GPUs with Distributed Data Parallel (DDP) takes the same time as training on a single GPU. Am I doing something wrong?

For training using a single GPU, I'm using the following command:
CUDA_VISIBLE_DEVICES=0 python3 src/main.py -t -hdf5 -l -std_stat -std_max 64 -std_step 64 -metrics fid is prdc -ref "train" -cfg src/configs/VWW/BigGAN-DiffAug-LeCam.yaml -data ../Datasets/vw_coco2014_96_GAN -save SAVE_PATH_VWW -mpc --post_resizer "friendly" --eval_backbone "InceptionV3_tf"

For training using the 4 GPUs, I'm using the following commands:
export MASTER_ADDR=localhost
export MASTER_PORT=1234
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 src/main.py -t -DDP -tn 1 -cn 0 -std_stat -std_max 64 -std_step 64 -metrics fid is prdc -ref "train" -cfg src/configs/VWW/BigGAN-DiffAug-LeCam.yaml -data ../Datasets/vw_coco2014_96_GAN -save SAVE_PATH_VWW -mpc --post_resizer "friendly" --eval_backbone "InceptionV3_tf"
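
For reference, this is roughly how I understand DDP is supposed to split the work across GPUs (a minimal, generic PyTorch sketch, not this repository's code; the names below are illustrative):

```python
# Generic PyTorch DDP sketch (illustrative, not StudioGAN's code): each rank
# sees only 1/world_size of the dataset, so the same amount of data should be
# processed roughly world_size times faster for the same per-GPU batch size.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler

def train_one_epoch(rank, world_size, dataset, model, per_gpu_batch):
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)
    ddp_model = DDP(model.cuda(rank), device_ids=[rank])

    # DistributedSampler hands each rank a disjoint 1/world_size shard.
    sampler = DistributedSampler(dataset, num_replicas=world_size, rank=rank)
    loader = DataLoader(dataset, batch_size=per_gpu_batch, sampler=sampler)

    for batch in loader:
        out = ddp_model(batch.cuda(rank))  # forward on this rank's shard
        out.sum().backward()               # DDP all-reduces gradients here
```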

MiguelCosta94 changed the title from "Using 4 GPUs is slower than using just 1" to "Using 4 GPUs for training takes the same time as using just 1" on Dec 5, 2023
mingukkang (Collaborator) commented

Could you please check the batch size used in the training process?

If you are currently using 1 GPU with a batch size of 256, switch to 4 GPUs with a batch size of 64 each to accelerate training. Do not keep the batch size of 256 on each GPU; that only increases the effective batch size and will not make training faster.
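
For illustration, the arithmetic only (the variable names below are not StudioGAN config options):

```python
# Batch-size arithmetic under DDP (illustrative names, not StudioGAN options).
world_size = 4          # number of GPUs / DDP processes
global_batch = 256      # effective batch size per optimizer step

# Split the global batch across ranks; each GPU then does 1/4 of the
# per-step work, so a fixed number of training steps finishes ~4x faster.
per_gpu_batch = global_batch // world_size
print(per_gpu_batch)    # 64

# Keeping 256 on every GPU instead raises the effective batch to 1024:
# each step takes about as long as on one GPU, so the total time for the
# same number of training steps does not shrink.
```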
