Commit 05a6cee
do bcast only if pp_group_size > 1 (#3915)
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
inkcherry and loadams authored Jul 14, 2023
1 parent 7528035 commit 05a6cee
Showing 1 changed file with 2 additions and 2 deletions.
deepspeed/runtime/pipe/engine.py (4 changes: 2 additions & 2 deletions)

@@ -524,8 +524,8 @@ def _aggregate_total_loss(self):
 
             assert self.global_rank in self.grid.pp_group
             losses = torch.Tensor([self.dp_group_loss, agg_loss]).to(self.device)
-            dist.broadcast(tensor=losses, src=self.global_rank, group=self.mpu.get_pipe_parallel_group())
-
+            if self.is_pipe_parallel:
+                dist.broadcast(tensor=losses, src=self.global_rank, group=self.mpu.get_pipe_parallel_group())
         else:
             # Get loss from last stage
             src_rank = self.grid.stage_to_global(self.num_stages - 1)
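For context, the pattern this commit applies is a common one in distributed training: skip a collective call when the process group involved contains only a single rank, since there are no other ranks to communicate with. Below is a minimal, self-contained sketch of that guard. It is not DeepSpeed's code; the function name, parameters, and the explicit `pipe_group_size` argument are illustrative (in DeepSpeed, `self.is_pipe_parallel` encapsulates the group-size check named in the commit title).

# Minimal sketch (assumed names, not DeepSpeed's API) of guarding a
# broadcast so it only runs when pipeline parallelism is actually in use.
import torch
import torch.distributed as dist

def share_losses(dp_group_loss, agg_loss, device, pipe_group, pipe_group_size):
    # Pack both loss values into one tensor so a single broadcast suffices.
    losses = torch.tensor([dp_group_loss, agg_loss], device=device)
    if pipe_group_size > 1:
        # The calling rank (the last pipeline stage in DeepSpeed's case)
        # acts as the broadcast source. With a size-1 group there is no
        # receiver, so the collective would be pure overhead and is skipped.
        dist.broadcast(tensor=losses, src=dist.get_rank(), group=pipe_group)
    return losses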
