torch.nn.parallel.DistributedDataParallel hangs
Crd1140234468 opened this issue · 5 comments
Crd1140234468 commented
This is a function inside torch
Crd1140234468 commented
Also, here's the problem I'm having with multiple GPUs
MC-E commented
what's the command you run?
Crd1140234468 commented
what's the command you run?
CUDA_VISIBLE_DEVICES=1,3 python -m torch.distributed.launch --nproc_per_node=2 --master_port 8888 test11.py --bsize=8
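For context, the setup this command drives looks roughly like the sketch below (a minimal sketch only; it assumes test11.py parses the --local_rank argument injected by torch.distributed.launch, and the nn.Linear stands in for the real model):

```python
# Minimal sketch of the setup driven by the launch command above.
# Assumption: test11.py reads the --local_rank argument injected by
# torch.distributed.launch; the nn.Linear is only a placeholder module.
import argparse
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

parser = argparse.ArgumentParser()
parser.add_argument("--local_rank", type=int, default=0)
parser.add_argument("--bsize", type=int, default=8)
args = parser.parse_args()

# With CUDA_VISIBLE_DEVICES=1,3 the two processes see the GPUs as cuda:0 and cuda:1.
torch.cuda.set_device(args.local_rank)
dist.init_process_group(backend="nccl")  # MASTER_ADDR/PORT etc. come from the launcher

model = torch.nn.Linear(10, 10).cuda(args.local_rank)  # placeholder model
model = DDP(model, device_ids=[args.local_rank])
```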
Crd1140234468 commented
what's the command you run?
Currently, model_ad can be wrapped in torch.nn.parallel.DistributedDataParallel without problems, but when the model is loaded from sd-v1-4.ckpt it cannot be wrapped: the call to torch.nn.parallel.DistributedDataParallel gets stuck.
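To make the failing step concrete, here is a rough sketch of the code path that stalls (not the exact script: the nn.Linear and the checkpoint key handling are placeholders for the actual Stable Diffusion model class):

```python
# Rough sketch of the step that hangs: restore the sd-v1-4.ckpt weights,
# then wrap the module in DDP. Setup mirrors the launch command above;
# the nn.Linear is a stand-in for the real model class.
import argparse
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

parser = argparse.ArgumentParser()
parser.add_argument("--local_rank", type=int, default=0)
args, _ = parser.parse_known_args()

torch.cuda.set_device(args.local_rank)
dist.init_process_group(backend="nccl")

model = nn.Linear(10, 10)                              # placeholder model
ckpt = torch.load("sd-v1-4.ckpt", map_location="cpu")  # large checkpoint
model.load_state_dict(ckpt.get("state_dict", ckpt), strict=False)
model = model.cuda(args.local_rank)

# Wrapping model_ad this way works; with the sd-v1-4 model this call gets stuck.
model = DDP(model, device_ids=[args.local_rank])
```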