xinge008/Cylinder3D

Does it currently support distributed multi-card training?

chenrui17 opened this issue · 3 comments

Will it be supported in the future? Training on a single card currently takes too much time.

I got the model to run on multiple GPUs, but the training script in this repo is written for a single GPU.
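For reference, the quickest way to get an existing single-GPU PyTorch training script onto multiple GPUs (without restructuring it) is `torch.nn.DataParallel`, which splits each batch across visible devices in a single process. This is only a hedged sketch with a toy stand-in model, not the actual Cylinder3D network or this repo's code, and DDP is generally preferred for real training speed:

```python
import torch

# Hypothetical stand-in for the Cylinder3D network
model = torch.nn.Linear(16, 4)

if torch.cuda.device_count() > 1:
    # DataParallel replicates the model and splits each batch across GPUs
    model = torch.nn.DataParallel(model)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

x = torch.randn(8, 16, device=device)
out = model(x)  # forward pass is scattered/gathered across devices
print(out.shape)
```

Note that sparse-convolution layers (spconv) may not work transparently under `DataParallel`, which is one reason a proper DDP pipeline is the more robust route.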

With current versions of torch / spconv / CUDA the model trains much faster. I rewrote it here for that purpose (still single GPU).

How do I run models on multiple GPUs?

@nakatomo8899 I wrote my own Distributed Data Parallel (DDP) pipeline for this (not open source). To do it, I used a combination of Lei Mao's cookbook, PyTorch's tutorials, and well-documented repos such as Swin.
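The pipeline described above is not public, but the skeleton of a DDP training loop is fairly standard: one process per device, an initialized process group, the model wrapped in `DistributedDataParallel`, and a `DistributedSampler` so each rank sees a disjoint shard of the data. A minimal sketch follows, using a toy linear model in place of Cylinder3D and the CPU-friendly `gloo` backend (swap in `nccl` and move tensors to `cuda:{rank}` for real multi-GPU training):

```python
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset, DistributedSampler

def train(rank, world_size):
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29500"
    # "gloo" runs on CPU; use "nccl" for multi-GPU training
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    model = torch.nn.Linear(8, 2)  # hypothetical stand-in for Cylinder3D
    model = DDP(model)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    ds = TensorDataset(torch.randn(32, 8), torch.randint(0, 2, (32,)))
    # Each rank gets a disjoint shard of the dataset
    sampler = DistributedSampler(ds, num_replicas=world_size, rank=rank)
    loader = DataLoader(ds, batch_size=4, sampler=sampler)

    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle shards each epoch
        for x, y in loader:
            opt.zero_grad()
            loss = torch.nn.functional.cross_entropy(model(x), y)
            loss.backward()  # gradients are all-reduced across ranks here
            opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    world_size = 2
    mp.spawn(train, args=(world_size,), nprocs=world_size)
```

In practice you would launch one process per GPU with `torchrun` instead of `mp.spawn`, and pin each rank to its device with `torch.cuda.set_device(rank)` before wrapping the model.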