train-saturn.py debugging
otoky opened this issue · 2 comments
Hi! I followed the tutorial up to the train-saturn section. I am using Google Colab, have pip installed all the dependencies, and am running on a GPU-enabled virtual machine.
(!pip install torch==1.10.2+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html)
When I run it, I get this error:
File "/gdrive/MyDrive/SATURN/files/train-saturn.py", line 1050, in
torch.cuda.set_device(args.device_num)
File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 404, in set_device
torch._C._cuda_setDevice(device)
RuntimeError: CUDA error: invalid device ordinal
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
I'm not very well versed in this stuff, but is there an issue with the code recognizing which GPU to use?
I think "invalid device ordinal" means it is trying to set the device to a GPU index that does not exist on the machine. A Colab GPU runtime typically exposes a single GPU, which is index 0, so try changing device_num to 0.
Otherwise there might be an issue with the torch install.
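To check both possibilities from a Colab cell, something like the sketch below might help. It is not part of train-saturn.py: the helper names (pick_device_num, set_cuda_device, describe_torch_install) are made up for illustration, and falling back to index 0 is an assumption about what you want when the requested GPU is missing.

```python
def pick_device_num(requested: int, device_count: int) -> int:
    """Return `requested` if it is a valid CUDA ordinal, otherwise fall back to 0.

    Hypothetical helper: valid ordinals are 0 .. device_count - 1.
    """
    if device_count <= 0:
        raise RuntimeError("No CUDA devices are visible to this process.")
    if 0 <= requested < device_count:
        return requested
    return 0  # assumed fallback: the first visible GPU


def set_cuda_device(requested: int) -> int:
    """Select a valid GPU and return the index that was actually used."""
    import torch  # imported here so pick_device_num works without torch

    device_num = pick_device_num(requested, torch.cuda.device_count())
    torch.cuda.set_device(device_num)
    return device_num


def describe_torch_install() -> dict:
    """Collect basic facts that help spot a broken or CPU-only torch install."""
    import torch

    return {
        "torch_version": torch.__version__,
        "built_with_cuda": torch.version.cuda,  # None on CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }
```

Calling describe_torch_install() first shows whether torch sees any GPU at all. If device_count is 1, the only valid value for device_num is 0.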
I think I got it to work, thanks!