May I ask about this error?
Could you tell me how to solve this problem?
(talk3d) F:\Talk3D>sh demo.sh
No CUDA runtime is found, using CUDA_HOME='C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8'
@@@@@@@@@@@@@@@@@@@@@
@ Training Talk3D @
@@@@@@@@@@@@@@@@@@@@@
N_gpus: 1
No CUDA runtime is found, using CUDA_HOME='C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8'
Traceback (most recent call last):
File "main.py", line 106, in
spawn_mp(_main, world_size)
File "main.py", line 39, in spawn_mp
mp.spawn(running_fn,args=(world_size,),nprocs=world_size,join=True)
File "C:\Users\User.conda\envs\talk3d\lib\site-packages\torch\multiprocessing\spawn.py", line 240, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "C:\Users\User.conda\envs\talk3d\lib\site-packages\torch\multiprocessing\spawn.py", line 198, in start_processes
while not context.join():
File "C:\Users\User.conda\envs\talk3d\lib\site-packages\torch\multiprocessing\spawn.py", line 160, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:
-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "C:\Users\User.conda\envs\talk3d\lib\site-packages\torch\multiprocessing\spawn.py", line 69, in _wrap
fn(i, *args)
File "F:\Talk3D\main.py", line 29, in _main
setup(rank, world_size,opts)
File "F:\Talk3D\main.py", line 35, in setup
distributed.init_process_group('nccl', rank=rank, world_size=world_size)
File "C:\Users\User.conda\envs\talk3d\lib\site-packages\torch\distributed\distributed_c10d.py", line 761, in init_process_group
default_pg = _new_process_group_helper(
File "C:\Users\User.conda\envs\talk3d\lib\site-packages\torch\distributed\distributed_c10d.py", line 886, in _new_process_group_helper
raise RuntimeError("Distributed package doesn't have NCCL " "built in")
RuntimeError: Distributed package doesn't have NCCL built in
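From the message, it seems the Windows build of torch does not ship the NCCL backend, so distributed.init_process_group('nccl', ...) cannot initialize here. Would a gloo fallback along these lines be an option? This is only a rough sketch of how setup() in main.py might be adjusted, not the actual repo code, and I don't know whether Talk3D supports gloo.

```python
# Rough sketch only -- main.py's real setup() may differ; gloo support in Talk3D is just a guess.
import torch.distributed as distributed

def setup(rank, world_size, opts):
    # NCCL is not included in Windows builds of PyTorch, so fall back to gloo there.
    backend = 'nccl' if distributed.is_nccl_available() else 'gloo'
    distributed.init_process_group(backend, rank=rank, world_size=world_size)
```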
Could you tell me how you installed the torch library?
It looks like your environment's CUDA and torch versions don't match. This install command
pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
or the other commands from this link that start with pip install torch==x.xx.x+cu11x..., such as
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
pip install torch==1.12.0+cu116 torchvision==0.13.0+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
might help, but I'm not sure.
Please let me know if the commands above do not work.
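If it helps, after reinstalling you can quickly check whether torch actually sees your CUDA runtime before running demo.sh again. This is just a generic sanity check, not anything Talk3D-specific:

```python
# Run inside the activated talk3d environment to confirm the install.
import torch
import torch.distributed as distributed

print(torch.__version__)                # e.g. 1.12.1+cu116
print(torch.version.cuda)               # CUDA version the wheel was built with
print(torch.cuda.is_available())        # should be True, otherwise "No CUDA runtime is found" will persist
print(distributed.is_nccl_available())  # NCCL backend availability (typically False on Windows)
```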