Feature Request: Update GPU operator invocation

Question

cloudymax opened this issue a year ago · 5 comments

Looks like Rancher is doing it like this: https://gist.github.com/bgulla/5ea0e7fd310b5db4f9b66036d1cdb3d3

helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
    && helm repo update

helm install --wait nvidiagpu \
    -n gpu-operator --create-namespace \
    --set toolkit.env[0].name=CONTAINERD_CONFIG \
    --set toolkit.env[0].value=/var/lib/rancher/k3s/agent/etc/containerd/config.toml \
    --set toolkit.env[1].name=CONTAINERD_SOCKET \
    --set toolkit.env[1].value=/run/k3s/containerd/containerd.sock \
    --set toolkit.env[2].name=CONTAINERD_RUNTIME_CLASS \
    --set toolkit.env[2].value=nvidia \
    --set toolkit.env[3].name=CONTAINERD_SET_AS_DEFAULT \
    --set-string toolkit.env[3].value=true \
    nvidia/gpu-operator
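
If this ends up in an Argo CD app or a values file rather than a long list of --set flags, the same four toolkit.env entries could probably be written as YAML and passed with -f. A rough sketch, not tested against a real cluster: the file name, the VALUES_FILE and DRY_RUN variables, and the run helper are made up for illustration, and by default the script only prints the helm command instead of running it.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical file name; any writable path works.
VALUES_FILE="${VALUES_FILE:-gpu-operator-k3s-values.yaml}"

# Same toolkit.env entries as the --set flags above, written as YAML.
# The last value is quoted so it stays a string, like --set-string does.
cat > "$VALUES_FILE" <<'EOF'
toolkit:
  env:
    - name: CONTAINERD_CONFIG
      value: /var/lib/rancher/k3s/agent/etc/containerd/config.toml
    - name: CONTAINERD_SOCKET
      value: /run/k3s/containerd/containerd.sock
    - name: CONTAINERD_RUNTIME_CLASS
      value: nvidia
    - name: CONTAINERD_SET_AS_DEFAULT
      value: "true"
EOF

# Illustrative helper: print the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run helm install --wait nvidiagpu \
    -n gpu-operator --create-namespace \
    -f "$VALUES_FILE" \
    nvidia/gpu-operator
```

Running it with DRY_RUN=0 would perform the actual install, assuming the nvidia repo has already been added as shown above.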

delete:
helm uninstall -n gpu-operator nvidiagpu

cluster-info:
kubectl get nodes -o wide

Answer 1 · 2023-08-03T09:28:27.000Z

I will take this as a feature if you can do the PR

Answer 2 · 2023-08-12T15:37:48.000Z

@cloudymax this should first be an app in https://github.com/small-hack/argocd-apps/tree/main and then we can add it to the default config for smol k8s :3

Answer 3 · 2023-09-10T08:10:51.000Z

Now that v1.0.0 is officially out, it's much easier to add this to the default applications. Some notes for that:

Make sure the application is documented in this section of the root README, including adding a small icon, even if you have to make one up :)
Make sure to set it to disabled by default in default_config.yaml under apps (also make sure it's in alphabetical order).
Make sure it's well documented in small-hack/argocd-apps, both in the root README.md and in its application directory's README.

Answer 4 · 2023-12-01T12:53:48.000Z

@cloudymax I'm marking this as blocked based on your work on this helm chart, but feel free to unblock it when you're ready

Answer 5 · 2023-12-02T12:02:57.000Z

Closing based on #58 (comment)