constantinpape/torch-em

Request for DistributedDataParallel


To allow for the training of larger models split across GPUs using the DefaultTrainer class, implement model-parallel capabilities using the torch.nn.parallel.DistributedDataParallel functionality.
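For reference, note that DistributedDataParallel itself replicates the full model on every device and parallelizes over the data; splitting a single model across GPUs would additionally require a model-parallel strategy. Below is a minimal sketch of standard PyTorch DDP usage outside of torch-em, just to illustrate the wrapping step; the model, optimizer, and launch setup are placeholders, and how this would be wired into DefaultTrainer is left open.

```python
# Minimal sketch of plain PyTorch DDP usage (not the torch-em DefaultTrainer API).
# Assumes the script is launched with torchrun, which sets RANK, LOCAL_RANK and WORLD_SIZE.
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def setup_ddp():
    # Initialize the default process group; NCCL is the usual backend for GPUs.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    return local_rank


def main():
    local_rank = setup_ddp()

    # Placeholder model; in torch-em this would be the network passed to the trainer.
    model = torch.nn.Linear(16, 16).to(local_rank)
    model = DDP(model, device_ids=[local_rank])

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

    # ... the usual training loop goes here; gradients are averaged across ranks
    # automatically during backward() ...

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Such a script would typically be launched with `torchrun --nproc_per_node=<num_gpus> train.py`, with each process driving one GPU.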

Thanks for opening this @sladenheim. We are currently focusing on a new release of micro_sam, but will then look into this.