using multi-GPU for training step
rafal7466 opened this issue · 1 comment
rafal7466 commented
Hi,
is it possible to use multiple GPUs to reduce training time with your implementation?
gmberton commented
Hi,
we did not implement multi-GPU training, because training the model is already quite fast (around 1 day on a modern GPU). It shouldn't be too difficult to add with DistributedDataParallel, but we have no plans to do it.
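For reference, here is a minimal sketch of what a DistributedDataParallel training loop could look like when launched with `torchrun`. This is not code from this repository: `build_model` and `build_train_dataset` are placeholders for the repo's actual model and dataset construction, and the loss/optimizer choices are illustrative only.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler

def main():
    # torchrun sets RANK, LOCAL_RANK and WORLD_SIZE for each spawned process
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = build_model().cuda(local_rank)       # placeholder: the repo's model
    model = DDP(model, device_ids=[local_rank])  # gradients are all-reduced across GPUs

    dataset = build_train_dataset()              # placeholder: the repo's training set
    sampler = DistributedSampler(dataset)        # each process gets a distinct shard
    loader = DataLoader(dataset, batch_size=32, sampler=sampler, num_workers=4)

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    criterion = torch.nn.CrossEntropyLoss()      # illustrative loss, not the paper's

    for epoch in range(10):
        sampler.set_epoch(epoch)                 # reshuffle shards differently each epoch
        for images, labels in loader:
            images = images.cuda(local_rank, non_blocking=True)
            labels = labels.cuda(local_rank, non_blocking=True)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()                      # DDP syncs gradients during backward
            optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

It could then be launched with something like `torchrun --nproc_per_node=4 train_ddp.py` to use 4 GPUs on one machine.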