Does it currently support distributed multi-card training?
chenrui17 opened this issue · 3 comments
chenrui17 commented
Will it be supported in the future? Currently, single-card training takes too much time.
L-Reichardt commented
I got the model to run on multiple GPUs; however, the training script in this repo is single-GPU only.
With current versions of torch / spconv / CUDA the model is much faster to train. I rewrote it here for that purpose (for single GPU).
nakatomo8899 commented
How do I run models on multiple GPUs?
L-Reichardt commented
@nakatomo8899 I wrote my own Distributed Data Parallel (DDP) pipeline for this (not open source). I used a combination of Lei Mao's cookbook, PyTorch's tutorial, and well-documented repos such as Swin to do this.
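For readers landing here, a minimal DDP training skeleton along the lines described above might look like the sketch below. This is not the author's pipeline; the tiny `Linear` model and random dataset are placeholders for the real network and data, and the `gloo` backend is used so it runs on CPU (swap in `nccl` for actual multi-GPU training).

```python
# Hypothetical minimal DDP sketch; the model and data are placeholders.
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset, DistributedSampler

def train(rank, world_size):
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    # "gloo" works on CPU; use "nccl" when each rank owns a GPU.
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    model = DDP(torch.nn.Linear(8, 1))  # stand-in for the real network
    optim = torch.optim.SGD(model.parameters(), lr=0.1)

    dataset = TensorDataset(torch.randn(64, 8), torch.randn(64, 1))
    # DistributedSampler gives each rank a disjoint shard of the data.
    sampler = DistributedSampler(dataset, num_replicas=world_size, rank=rank)
    loader = DataLoader(dataset, batch_size=8, sampler=sampler)

    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle shards each epoch
        for x, y in loader:
            optim.zero_grad()
            loss = torch.nn.functional.mse_loss(model(x), y)
            loss.backward()  # DDP all-reduces gradients across ranks here
            optim.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    world_size = 2
    mp.spawn(train, args=(world_size,), nprocs=world_size, join=True)
```

With `nccl`, each rank would additionally call `torch.cuda.set_device(rank)` and move the model and batches to that device before wrapping in `DDP`.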