Running on server with multiple nodes
meidachen opened this issue · 1 comment
meidachen commented
Hi,
Thank you for the great work. I'm trying to run this repo on our school's HPC cluster, but it hangs when I use multiple nodes. Specifically, it hangs at this call:
"mp.spawn(main_worker, nprocs=args.ngpus_per_node, args=(args.ngpus_per_node, args))"
Each node on our HPC cluster has only 2 GPUs, so I would like to use more nodes for this run. Could you please help me find where this issue might come from?
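For context, one common cause of a hang right after `mp.spawn` in multi-node runs (not confirmed for this repo) is a rank or world-size mismatch: `torch.distributed.init_process_group` blocks until all `world_size` processes have joined, so every node has to agree on the total process count and each process needs a unique global rank. Below is a minimal sketch of that arithmetic, assuming 2 nodes with 2 GPUs each. The helper names, the host name `node0`, and port 29500 are hypothetical and do not come from the repo.

```python
def world_size(nnodes: int, ngpus_per_node: int) -> int:
    """Total number of processes across all nodes (one process per GPU)."""
    return nnodes * ngpus_per_node


def global_rank(node_rank: int, ngpus_per_node: int, local_rank: int) -> int:
    """Unique rank of the process driving GPU `local_rank` on node `node_rank`."""
    return node_rank * ngpus_per_node + local_rank


def rendezvous_env(master_addr: str, master_port: int, nnodes: int,
                   ngpus_per_node: int, node_rank: int, local_rank: int) -> dict:
    """Values the env:// init method reads; all are strings, as in os.environ.

    MASTER_ADDR must be a host that every node can reach, and it must be
    the same on every node (assumption: the rank-0 node's hostname).
    """
    return {
        "MASTER_ADDR": master_addr,
        "MASTER_PORT": str(master_port),
        "WORLD_SIZE": str(world_size(nnodes, ngpus_per_node)),
        "RANK": str(global_rank(node_rank, ngpus_per_node, local_rank)),
    }


if __name__ == "__main__":
    # Hypothetical layout: 2 nodes x 2 GPUs, rendezvous on "node0":29500.
    for node in range(2):
        for gpu in range(2):
            print(node, gpu, rendezvous_env("node0", 29500, 2, 2, node, gpu))
```

If the script is started on only one node, or each node computes `WORLD_SIZE` from its own 2 GPUs while the other node never joins with ranks 2 and 3, the rendezvous would wait indefinitely, which would look like a hang at this call.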
Thank you!
Abbsalehi commented
@meidachen did you solve your issue?