You do not need multiple GPUs. The code uses PyTorch's distributed package, so you need to start it with the distributed launcher command. With a 3050 you will probably need to reduce the batch size quite a bit.

Question

Closed this issue 9 months ago · 3 comments

You do not need multiple GPUs. The code uses PyTorch's distributed package, so you need to start it with the distributed launcher command. With a 3050 you will probably need to reduce the batch size quite a bit.

Originally posted by @TontonTremblay in #318 (comment)

I tried to run it. When I use the command python -m torch.distributed.launch --nproc_per_node=1 train.py --network dope --epochs 2 --batchsize 1 --outf tmp/ --data ./example, I get RuntimeError: No rendezvous handler for env://
I think this happens because I'm using Windows 10. Or is there another reason?

Answer 1 · 2023-09-12T04:09:07.000Z

Hard for me to know. It might also be that a 3050, at the end of the day, is not the best equipped for deep learning.

Answer 2 · 2023-09-20T03:27:50.000Z

Sounds like it may be something with your environment. Try updating your drivers, packages, and maybe even the operating system. Maybe switch to Linux; it is what I use.
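One way to narrow down an environment problem like this is to check, outside of train.py, whether the installed PyTorch build can set up a single-process group through env:// at all. The following is only a rough sketch, not part of the repository: the helper names single_process_env and check_distributed are made up for illustration, the port 29500 is an arbitrary choice, and the gloo backend is used because NCCL is not available on Windows.

```python
import os


def single_process_env(port=29500):
    """Variables that the env:// rendezvous reads, filled in for one process.

    The port is an arbitrary assumption; any free port should do.
    """
    return {
        "MASTER_ADDR": "127.0.0.1",
        "MASTER_PORT": str(port),
        "RANK": "0",
        "WORLD_SIZE": "1",
    }


def check_distributed():
    """Try to create and tear down a one-process group; return a short status."""
    try:
        import torch.distributed as dist
    except ImportError:
        return "torch is not installed in this environment"

    # Some builds ship without distributed support compiled in.
    if not dist.is_available():
        return "this PyTorch build has no distributed support"

    os.environ.update(single_process_env())
    try:
        # gloo runs on CPU and is the backend to try on Windows.
        dist.init_process_group(backend="gloo", init_method="env://")
    except Exception as exc:  # report the error instead of crashing
        return f"init_process_group failed: {exc!r}"

    dist.destroy_process_group()
    return "env:// rendezvous works with the gloo backend"


if __name__ == "__main__":
    print(check_distributed())
```

If this reports that distributed support is missing or that initialization failed, the problem is likely in the PyTorch install or platform rather than in the training script.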

Also, check that you are not running out of memory on your GPU.
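A quick way to look at GPU memory from Python is sketched below. It is a minimal example under the assumption that device 0 is the GPU in question; the helper names to_mib and gpu_memory_report are hypothetical and not from the repository. Calling it right after a training step fails can give a rough idea of how close the run is to the card's limit.

```python
def to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / (1024 ** 2)


def gpu_memory_report(device=0):
    """Return a one-line summary of memory on the given CUDA device.

    Assumes device 0 by default; falls back to a message if torch or CUDA
    is not available.
    """
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"

    if not torch.cuda.is_available():
        return "CUDA is not available to PyTorch"

    props = torch.cuda.get_device_properties(device)
    allocated = torch.cuda.memory_allocated(device)  # held by live tensors
    reserved = torch.cuda.memory_reserved(device)    # held by the caching allocator
    return (
        f"{props.name}: total {to_mib(props.total_memory):.0f} MiB, "
        f"allocated {to_mib(allocated):.0f} MiB, "
        f"reserved {to_mib(reserved):.0f} MiB"
    )


if __name__ == "__main__":
    print(gpu_memory_report())
```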

Answer 3 · 2023-10-09T00:02:59.000Z

Thanks for the answer. I will check my drivers and try to train again.