- Launch one process per node in SLURM
  - use torchrun (replaces the deprecated torch.distributed.launch)
  - use mp.spawn (torch.multiprocessing.spawn)
- Launch one process per GPU directly with srun
  - use srun
  - launch one process per GPU with srun
  - use torch.distributed.launch
  - launch each process with srun
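The launch modes listed above differ mainly in who assigns each process its rank. A minimal sketch of resolving that in the training script, assuming the standard variables exported by torchrun (`RANK`, `WORLD_SIZE`, `LOCAL_RANK`) and by Slurm (`SLURM_PROCID`, `SLURM_NTASKS`, `SLURM_LOCALID`). The names `DistEnv`, `resolve_dist_env` and `spawn_rank` are hypothetical helpers, not part of PyTorch or Slurm:

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class DistEnv:
    rank: int        # global rank across all nodes
    world_size: int  # total number of processes in the job
    local_rank: int  # index of this process on its node (usually the GPU index)
    launcher: str    # "torchrun" or "srun"


def resolve_dist_env(env: Optional[Mapping[str, str]] = None) -> DistEnv:
    """Hypothetical helper: read rank information from the launcher's variables."""
    env = os.environ if env is None else env

    # Check torchrun first: when torchrun is started by srun (one task per node),
    # the SLURM_* variables are also present but describe the node-level task,
    # not the per-GPU worker.
    if "RANK" in env and "WORLD_SIZE" in env:
        return DistEnv(
            rank=int(env["RANK"]),
            world_size=int(env["WORLD_SIZE"]),
            local_rank=int(env.get("LOCAL_RANK", 0)),
            launcher="torchrun",
        )

    # srun with one task per GPU: Slurm exports one set of variables per task.
    if "SLURM_PROCID" in env and "SLURM_NTASKS" in env:
        return DistEnv(
            rank=int(env["SLURM_PROCID"]),
            world_size=int(env["SLURM_NTASKS"]),
            local_rank=int(env.get("SLURM_LOCALID", 0)),
            launcher="srun",
        )

    raise RuntimeError("neither torchrun nor Slurm rank variables were found")


def spawn_rank(node_rank: int, gpus_per_node: int, local_rank: int) -> int:
    """Global rank for the mp.spawn case, assuming the same GPU count on every node."""
    return node_rank * gpus_per_node + local_rank
```

The resolved values can then be passed to `torch.distributed.init_process_group(backend="nccl", rank=..., world_size=...)`. In the pure srun case the job script still has to provide `MASTER_ADDR` and `MASTER_PORT` itself, since no launcher sets them.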