Help! Do I need to reinstall the driver after the machine reboots?
Closed this issue · 2 comments
I tried this device plugin in my cluster and it worked fine at first. I ran the driver installer only once, with a command like docker run --rm installer -v -e ..., instead of deploying it as a DaemonSet.
After my machine rebooted, nvidia-smi failed with this error: NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
Do I have to run the driver-installer container as a DaemonSet to make sure the driver keeps working?
Yes, please run it as a DaemonSet.
Not sure if you are using it on GKE, but here's more information on the expected workflow for GKE: https://cloud.google.com/kubernetes-engine/docs/concepts/gpus
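For illustration, a DaemonSet that re-runs the installer whenever a node starts its pod might look roughly like the sketch below. This is not the project's actual manifest: the names and labels are hypothetical, the image name is simply the one from the docker run command in the question, the privileged setting is an assumption, and the elided -v and -e arguments are left for you to fill in. The installer runs as an init container, and a pause container keeps the pod alive afterwards, because a DaemonSet pod always restarts containers that exit.

```yaml
# Sketch only: names, labels, and the privileged setting are assumptions.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: driver-installer            # hypothetical name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: driver-installer
  template:
    metadata:
      labels:
        app: driver-installer
    spec:
      initContainers:
        # Runs to completion each time the pod starts on a node,
        # including after the node reboots.
        - name: installer
          image: installer          # image name taken from the docker run command
          securityContext:
            privileged: true        # assumption: installing a driver needs host access
          # Translate the -v and -e arguments of your docker run command
          # into volumeMounts and env entries here.
      containers:
        # Does nothing; it only keeps the pod running after the installer exits.
        - name: pause
          image: registry.k8s.io/pause:3.9
```

With something like this applied, each node runs the installer again when its pod is recreated after a reboot, which a one-off docker run does not do.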
@mindprince Thanks. I modified the Dockerfile so that it runs on CentOS, and then got this error. I will try moving it to a DaemonSet.