Feature Request: Update GPU operator invocation
cloudymax opened this issue · 5 comments
Looks like Rancher is doing it like this: https://gist.github.com/bgulla/5ea0e7fd310b5db4f9b66036d1cdb3d3
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
&& helm repo update
helm install --wait nvidiagpu \
-n gpu-operator --create-namespace \
--set 'toolkit.env[0].name=CONTAINERD_CONFIG' \
--set 'toolkit.env[0].value=/var/lib/rancher/k3s/agent/etc/containerd/config.toml' \
--set 'toolkit.env[1].name=CONTAINERD_SOCKET' \
--set 'toolkit.env[1].value=/run/k3s/containerd/containerd.sock' \
--set 'toolkit.env[2].name=CONTAINERD_RUNTIME_CLASS' \
--set 'toolkit.env[2].value=nvidia' \
--set 'toolkit.env[3].name=CONTAINERD_SET_AS_DEFAULT' \
--set-string 'toolkit.env[3].value=true' \
nvidia/gpu-operator
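For anyone who would rather not juggle indexed `--set` flags, the same four toolkit env entries could probably live in a values file instead. This is only a sketch: the file name `gpu-operator-values.yaml` is made up for illustration, and the script just prints the helm command rather than running it, so nothing touches a cluster.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical file name, chosen for this example only.
VALUES_FILE="${VALUES_FILE:-gpu-operator-values.yaml}"

# Same settings as the --set / --set-string flags above, as a YAML list.
# "true" is quoted so it stays a string, matching --set-string.
cat > "$VALUES_FILE" <<'EOF'
toolkit:
  env:
    - name: CONTAINERD_CONFIG
      value: /var/lib/rancher/k3s/agent/etc/containerd/config.toml
    - name: CONTAINERD_SOCKET
      value: /run/k3s/containerd/containerd.sock
    - name: CONTAINERD_RUNTIME_CLASS
      value: nvidia
    - name: CONTAINERD_SET_AS_DEFAULT
      value: "true"
EOF

# Print the install command for review; drop the leading "echo" to run it.
echo helm install --wait nvidiagpu \
  -n gpu-operator --create-namespace \
  -f "$VALUES_FILE" \
  nvidia/gpu-operator
```

A values file is also likely easier to carry over into an Argo CD app later than a long list of flags.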
delete:
helm uninstall -n gpu-operator nvidiagpu
cluster-info:
kubectl get nodes -o wide
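To check whether the nodes actually advertise a GPU after the install, something like the following might help. It is a rough sketch: `show_gpu_nodes` and the `--run` switch are names invented here, and it assumes the device plugin exposes the usual `nvidia.com/gpu` resource.

```shell
#!/usr/bin/env bash

# Column spec: node name plus allocatable nvidia.com/gpu count.
# The dots inside the resource name are escaped so they are not read as path separators.
GPU_COLUMNS='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'

# Hypothetical helper: list every node with its allocatable GPU count.
show_gpu_nodes() {
  kubectl get nodes -o "custom-columns=${GPU_COLUMNS}"
}

# Only query the cluster when explicitly asked to.
if [ "${1:-}" = "--run" ]; then
  show_gpu_nodes
fi
```

Nodes without a GPU should show `<none>` in the GPU column.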
I will take this as a feature if you can do the PR
@cloudymax this should first be an app in https://github.com/small-hack/argocd-apps/tree/main and then we can add it to the default config for smol k8s :3
Now that v1.0.0 is officially out, it's much easier to add this to the default applications. Some notes for that:
- make sure the application is documented in this section of the root README, including adding a small icon, even if you have to make one up :)
- make sure to set it to disabled by default in default_config.yaml under apps (also make sure it's in alphabetical order)
- make sure it's well documented in small-hack/argocd-apps, both in the root README.md and in its application directory's README.
@cloudymax I'm marking this as blocked based on your work on this helm chart, but feel free to unblock it when you're ready
Closing based on #58 (comment)