
Add Support for Multi-GPU Training with PyTorch Lightning #2338

Closed

Conversation

@OrjwanZaafarani

This pull request introduces support for multi-GPU training in the Sentence Transformers library using PyTorch Lightning.

The following changes have been made:

  • Updated README.md to include instructions on how to perform multi-GPU training.
  • Added a new module, SentenceTransformerMultiGPU.py, to enable multi-GPU training.
  • Included PyTorch Lightning in the requirements.txt file to ensure compatibility.

This enhancement will allow users to leverage the power of multiple GPUs for faster and more efficient training with Sentence Transformers.
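As a rough illustration of the approach (not the PR's actual implementation), a Lightning wrapper around SentenceTransformer could look like the following sketch; the class name, optimizer choice, and batch format are assumptions:

import torch
import lightning.pytorch as pl
from sentence_transformers import SentenceTransformer

class LitSentenceTransformer(pl.LightningModule):  # hypothetical name
    def __init__(self, model_name, loss_cls, lr=2e-5):
        super().__init__()
        self.model = SentenceTransformer(model_name)
        # sentence-transformers losses are nn.Modules built around the model
        self.loss_fn = loss_cls(self.model)
        self.lr = lr

    def training_step(self, batch, batch_idx):
        # Assumes the dataloader's collate_fn (e.g. the model's
        # smart_batching_collate) produced a (features, labels) pair
        features, labels = batch
        loss = self.loss_fn(features, labels)
        self.log("train_loss", loss, sync_dist=True)
        return loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.lr)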

Closes the following issues:

@OrjwanZaafarani OrjwanZaafarani changed the title Add MultiGPU training using Pytorch Lightning Add Support for Multi-GPU Training with PyTorch Lightning Oct 23, 2023
@OrjwanZaafarani
Author

Multi-GPU Training

The following code snippet wraps SentenceTransformer to enable multi-GPU training using PyTorch Lightning:

from SentenceTransformer_MultiGPU import SentenceTransformerMultiGPU
from sentence_transformers import losses
import lightning.pytorch as pl

# Wrap any Hugging Face model together with a sentence-transformers loss
model = SentenceTransformerMultiGPU("author/any_huggingface_model", losses.CosineSimilarityLoss)

# Train on 3 GPUs with DDP
trainer = pl.Trainer(max_epochs=2,
                     accelerator='gpu',
                     devices=3,
                     strategy='ddp_find_unused_parameters_true',
                     log_every_n_steps=50)

# Lightning 2.x expects the plural train_dataloaders/val_dataloaders arguments;
# both dataloaders must be built beforehand (see the note below)
trainer.fit(model, train_dataloaders=train_dataloader, val_dataloaders=val_dataloader)

Please note that the dataloaders are expected to yield InputExample objects from the dataset class; a sketch of how to build them follows below. For more details, take a look at the main module SentenceTransformer_MultiGPU.py and the example examples/multigpu_training.py.
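For illustration, here is a minimal way to build such dataloaders; the sentences, labels, and batch size are placeholders, and the wrapper is assumed to install the model's smart_batching_collate (as SentenceTransformer.fit does) to tokenize each batch:

from torch.utils.data import DataLoader
from sentence_transformers import InputExample

# Toy sentence pairs with similarity labels; replace with your own data
train_examples = [
    InputExample(texts=["A plane is taking off.", "An air plane is taking off."], label=0.9),
    InputExample(texts=["A man is playing a flute.", "A man is eating a banana."], label=0.1),
]
val_examples = [
    InputExample(texts=["A woman is reading.", "A woman is writing."], label=0.5),
]

# Each batch is a list of InputExample objects until a collate_fn is applied
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
val_dataloader = DataLoader(val_examples, batch_size=16)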

@tomaarsen
Collaborator

Hello!

Thanks for creating this PR. I am intending to introduce Multi-GPU training via #2449 instead, by relying on the transformers Trainer. The idea is to fully replace the current fit training approach with that Trainer-based approach, while preserving backwards compatibility (i.e. the fit will still work, it just uses the Trainer behind the scenes).
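For reference, the legacy fit API being preserved looks like this (a standard sentence-transformers usage sketch with placeholder model and data):

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("all-MiniLM-L6-v2")
train_examples = [InputExample(texts=["A first sentence", "A second sentence"], label=0.8)]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.CosineSimilarityLoss(model)

# In the Trainer-based approach, this call is routed through the
# transformers-style Trainer behind the scenes
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1)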

I've kept this open in case I run into concrete issues with #2449, but I think I would rather move forward with that PR instead.

  • Tom Aarsen

@tomaarsen
Collaborator

Hello!

This behaviour has now been implemented in the v3.0 update. See https://sbert.net/docs/sentence_transformer/training/distributed.html for more details.
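For reference, a minimal multi-GPU training sketch with the v3 API, following the linked documentation (model name and data are placeholders):

# train.py
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("all-MiniLM-L6-v2")
train_dataset = Dataset.from_dict({
    "anchor": ["A plane is taking off.", "A man is playing a flute."],
    "positive": ["An air plane is taking off.", "A man plays a flute."],
})
loss = MultipleNegativesRankingLoss(model)

args = SentenceTransformerTrainingArguments(
    output_dir="output",
    num_train_epochs=1,
    per_device_train_batch_size=16,
)
trainer = SentenceTransformerTrainer(model=model, args=args, train_dataset=train_dataset, loss=loss)
trainer.train()

Launching with, for example, torchrun --nproc_per_node=4 train.py runs the script on 4 GPUs with DDP automatically.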

  • Tom Aarsen

@tomaarsen tomaarsen closed this Jun 3, 2024