If CUDA finds only 1 device, app.py crashes immediately
Aultus-defora opened this issue · 2 comments
def_rank in lib/sync.py raises a modulo-by-zero error (ZeroDivisionError) when torch.cuda.device_count() returns 0
"The cuda devices are numbered from 0 and up."
At least that is what albanD answered about this question in https://discuss.pytorch.org/t/torch-cuda-device-count-always-return-0/135632
If there is only one GPU it will be numbered 0, so app.py crashes immediately due to the modulo by 0
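A minimal sketch of the kind of guard that would avoid the crash. The helper name `safe_rank` and its signature are hypothetical (the actual `def_rank` in lib/sync.py is not shown in this issue); the point is only that `torch.cuda.device_count()` can legitimately return 0, so the divisor must be checked before the modulo:

```python
def safe_rank(global_rank: int, device_count: int) -> int:
    """Hypothetical helper: map a global rank onto a visible CUDA device.

    torch.cuda.device_count() returns the *number* of visible devices
    (1 on a single-GPU machine), and 0 when no device is visible.
    Guarding against 0 turns the ZeroDivisionError into a clear error.
    """
    if device_count == 0:
        raise RuntimeError(
            "No CUDA devices visible; check your driver and CUDA_VISIBLE_DEVICES."
        )
    return global_rank % device_count


# In lib/sync.py the divisor would come from torch.cuda.device_count(), e.g.:
#   rank = safe_rank(global_rank, torch.cuda.device_count())
print(safe_rank(3, 2))  # a single process picks device 3 % 2 == 1
```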
I tried on a one-GPU machine, and torch.cuda.device_count() should return 1 there...
In the question at https://discuss.pytorch.org/t/torch-cuda-device-count-always-return-0/135632, it returns 0 because os.environ["CUDA_VISIBLE_DEVICES"] = "3" was set
(which means a machine with only one GPU cannot find GPU ID=3)
I guess the machine I used just cannot use CUDA, then.