CUDA runtime error when changing gpu_idx from 0 to another number
hhhharold opened this issue · 2 comments
hhhharold commented
If I change gpu_idx from 0 to another number in train.sh (for example, gpu_idx 5), I get the error shown below. With gpu_idx 0, the script runs normally. I am confused by this bug.
THCudaCheck FAIL file=/pytorch/torch/csrc/cuda/Module.cpp line=59 error=101 : invalid device ordinal
Traceback (most recent call last):
torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (101) : invalid device ordinal at /pytorch/torch/csrc/cuda/Module.cpp:59
maudzung commented
Hi @hhhharold ,
How many GPUs does your computer have? Device indices start at 0, so using gpu_idx=5 means that you have at least 6 GPUs. Am I right?
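
A quick way to check this is to compare gpu_idx against the number of devices PyTorch can see. The snippet below is only a minimal sketch: `validate_gpu_idx` and `visible_gpu_count` are hypothetical helper names that are not part of this repository, and the only PyTorch call assumed is `torch.cuda.device_count()`.

```python
def validate_gpu_idx(gpu_idx, device_count):
    """Return gpu_idx if it names a visible device, otherwise raise ValueError.

    Hypothetical helper for illustration; not part of the repository.
    """
    if device_count <= 0:
        raise ValueError("No CUDA devices are visible")
    if not 0 <= gpu_idx < device_count:
        raise ValueError(
            "gpu_idx=%d is invalid: valid indices are 0 to %d"
            % (gpu_idx, device_count - 1)
        )
    return gpu_idx


def visible_gpu_count():
    """Number of CUDA devices PyTorch can see (0 if torch is not installed)."""
    try:
        import torch
    except ImportError:
        return 0
    return torch.cuda.device_count()


if __name__ == "__main__":
    # Print the count so it can be compared with the gpu_idx set in train.sh.
    print("Visible CUDA devices:", visible_gpu_count())
```

If the printed count is 6 or more, gpu_idx=5 should be a valid ordinal. If it is smaller, that would explain the "invalid device ordinal" error. Note that when the CUDA_VISIBLE_DEVICES environment variable is set, only the listed devices are visible and they are renumbered starting from 0.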