[ChatLLaMA] No GPU Detected issue
MuffinC opened this issue · 2 comments
When I try to run the training step with
python artifacts/main.py artifacts/config/config.yaml --type ALL
it fails with ValueError("No Gpu available"). Does anyone have advice on this? I am currently running on an Azure cloud GPU VM. The GPU is an NVIDIA Corporation GP100GL [Tesla P100 PCIe 16GB]. If any more information is needed, please reach out. Thanks!
Hi @MuffinC, thank you for reaching out. It looks like something is missing in the GPU setup. Could you please share the output of the following two commands?
nvidia-smi
and
python -c "import torch; print(torch.cuda.is_available())"
Hi Diego, thanks for pointing me in the right direction. I managed to solve the issue by running the following commands:
apt-get remove --purge '^nvidia-.*'
sudo apt-get install ubuntu-desktop
apt-get --purge remove "*cublas*" "cuda*"
apt-get --purge remove "*nvidia*"
sudo rm /etc/X11/xorg.conf
sudo apt autoremove
reboot
ubuntu-drivers devices
ubuntu-drivers autoinstall
reboot
nvidia-smi
Initially, nvidia-smi was showing an error and torch.cuda.is_available() was returning False. Now it returns True.
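For anyone who hits the same error, the two checks suggested earlier in this thread (nvidia-smi and torch.cuda.is_available()) can be combined into one script. This is only a minimal sketch, not part of ChatLLaMA: the function names driver_visible, torch_sees_gpu and diagnose are made up for illustration, and it assumes nvidia-smi exits with a non-zero status when the driver is broken.

```python
import shutil
import subprocess


def driver_visible() -> bool:
    """Return True if nvidia-smi is on PATH and exits successfully."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        result = subprocess.run([exe], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    # Assumption: a broken driver makes nvidia-smi exit with a non-zero code.
    return result.returncode == 0


def torch_sees_gpu() -> bool:
    """Return True if PyTorch imports cleanly and reports a CUDA device."""
    try:
        import torch
    except Exception:
        # torch is missing, or its native libraries failed to load.
        return False
    return bool(torch.cuda.is_available())


def diagnose() -> str:
    """Summarize which layer (driver or PyTorch) appears to be failing."""
    driver = driver_visible()
    cuda = torch_sees_gpu()
    if driver and cuda:
        return "ok: driver and PyTorch both see the GPU"
    if not driver:
        return "driver problem: nvidia-smi is missing or failing"
    return "driver works, but PyTorch cannot use CUDA (check the installed torch build)"


if __name__ == "__main__":
    print(diagnose())
```

If the script reports a driver problem, reinstalling the driver as described above is the likely fix. If only the PyTorch check fails, the installed torch build (for example a CPU-only wheel) is the more likely culprit.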