nvidia-smi mapped into a container as a blank file when using k8s+containerd+nvidia-container-runtime
drtpotter opened this issue · 3 comments
Hi there,
I'm trying to use nvidia-container-runtime with containerd 1.4.4 under Kubernetes 1.21 but /usr/bin/nvidia-smi seems to be mapped into the container as a blank file. I'm using OpenSUSE Tumbleweed.
If I use CRI-O with the nvidia container runtime under k8s, my target container can see my RTX 3060 and I can run nvidia-smi, so I'm pretty sure my container is set up correctly. Unfortunately, other applications don't seem to work with CRI-O, so I am trying to use k8s+containerd instead.
If I switch to plain containerd I can run this command fine:
sudo containerd-ctr -a /var/run/docker/containerd/containerd.sock run --rm --gpus 0 docker.io/nvidia/cuda:11.0-base nvidia-smi nvidia-smi
However, when using k8s+containerd it looks like /usr/bin/nvidia-smi is mapped through as a blank file inside the container. I must stress that the runtime works fine under k8s+CRI-O, so it appears that something goes wrong with making nvidia-smi available inside the container specifically under k8s+containerd. Here is my /etc/containerd/config.toml; I'm pretty sure I have followed NVIDIA's directions for patching the file to use the nvidia container runtime.
version = 2
root = "/var/lib/docker/containerd/daemon"
state = "/var/run/docker/containerd/daemon"
plugin_dir = ""
disabled_plugins = []
required_plugins = []
oom_score = 0

[grpc]
  address = "/var/run/docker/containerd/containerd.sock"
  tcp_address = ""
  tcp_tls_cert = ""
  tcp_tls_key = ""
  uid = 0
  gid = 0
  max_recv_message_size = 16777216
  max_send_message_size = 16777216

[ttrpc]
  address = ""
  uid = 0
  gid = 0

[debug]
  address = ""
  uid = 0
  gid = 0
  level = ""

[metrics]
  address = ""
  grpc_histogram = false

[cgroup]
  path = ""

[timeouts]
  "io.containerd.timeout.shim.cleanup" = "5s"
  "io.containerd.timeout.shim.load" = "5s"
  "io.containerd.timeout.shim.shutdown" = "3s"
  "io.containerd.timeout.task.state" = "2s"

[plugins]
  [plugins."io.containerd.gc.v1.scheduler"]
    pause_threshold = 0.02
    deletion_threshold = 0
    mutation_threshold = 100
    schedule_delay = "0s"
    startup_delay = "100ms"
  [plugins."io.containerd.grpc.v1.cri"]
    disable_tcp_service = true
    stream_server_address = "127.0.0.1"
    stream_server_port = "0"
    stream_idle_timeout = "4h0m0s"
    enable_selinux = false
    selinux_category_range = 1024
    sandbox_image = "k8s.gcr.io/pause:3.2"
    stats_collect_period = 10
    systemd_cgroup = false
    enable_tls_streaming = false
    max_container_log_line_size = 16384
    disable_cgroup = false
    disable_apparmor = false
    restrict_oom_score_adj = false
    max_concurrent_downloads = 3
    disable_proc_mount = false
    unset_seccomp_profile = ""
    tolerate_missing_hugetlb_controller = true
    disable_hugetlb_controller = true
    ignore_image_defined_volumes = false
    [plugins."io.containerd.grpc.v1.cri".containerd]
      snapshotter = "overlayfs"
      default_runtime_name = "runc"
      no_pivot = false
      disable_snapshot_annotations = true
      discard_unpacked_layers = false
      [plugins."io.containerd.grpc.v1.cri".containerd.default_runtime]
        runtime_type = ""
        runtime_engine = ""
        runtime_root = ""
        privileged_without_host_devices = false
        base_runtime_spec = ""
      [plugins."io.containerd.grpc.v1.cri".containerd.untrusted_workload_runtime]
        runtime_type = ""
        runtime_engine = ""
        runtime_root = ""
        privileged_without_host_devices = false
        base_runtime_spec = ""
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
          runtime_type = "io.containerd.runc.v2"
          runtime_engine = ""
          runtime_root = ""
          privileged_without_host_devices = false
          base_runtime_spec = ""
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
            SystemdCgroup = true
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
          privileged_without_host_devices = false
          runtime_engine = ""
          runtime_root = ""
          runtime_type = "io.containerd.runc.v1"
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
            BinaryName = "/usr/bin/nvidia-container-runtime"
            SystemdCgroup = true
    [plugins."io.containerd.grpc.v1.cri".cni]
      bin_dir = "/opt/cni/bin"
      conf_dir = "/etc/cni/net.d"
      max_conf_num = 1
      conf_template = ""
    [plugins."io.containerd.grpc.v1.cri".registry]
      [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
          endpoint = ["https://registry-1.docker.io"]
    [plugins."io.containerd.grpc.v1.cri".image_decryption]
      key_model = ""
    [plugins."io.containerd.grpc.v1.cri".x509_key_pair_streaming]
      tls_cert_file = ""
      tls_key_file = ""
  [plugins."io.containerd.internal.v1.opt"]
    path = "/opt/containerd"
  [plugins."io.containerd.internal.v1.restart"]
    interval = "10s"
  [plugins."io.containerd.metadata.v1.bolt"]
    content_sharing_policy = "shared"
  [plugins."io.containerd.monitor.v1.cgroups"]
    no_prometheus = false
  [plugins."io.containerd.runtime.v1.linux"]
    shim = "containerd-shim"
    runtime = "runc"
    runtime_root = ""
    no_shim = false
    shim_debug = false
  [plugins."io.containerd.runtime.v2.task"]
    platforms = ["linux/amd64"]
  [plugins."io.containerd.service.v1.diff-service"]
    default = ["walking"]
  [plugins."io.containerd.snapshotter.v1.devmapper"]
    root_path = ""
    pool_name = ""
    base_image_size = ""
    async_remove = false
Any help or suggestions as to why /usr/bin/nvidia-smi is mapped through as a blank file in the container would be most appreciated!
Kind regards,
Toby
Hi @drtpotter, when launching a container on k8s+containerd, do you specify a runtime class to ensure that the nvidia runtime is selected? Note that the --gpus all flag on the containerd-ctr command line works differently from how k8s runs a container through containerd.
Some suggestions:
- Try to get the container started using ctr, specifying the nvidia runtime explicitly instead of relying on the --gpus all flag.
- Check whether it works as expected when nvidia is set as the default_runtime_name in the containerd config.
- Ensure that the pod spec for GPU-enabled pods includes a RuntimeClass of nvidia (matching the runtime name in the containerd config).
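For the first suggestion, an invocation along these lines should exercise the NVIDIA runtime binary directly, without the --gpus flag (a sketch only: the socket path matches the config above, while the container ID and the use of NVIDIA_VISIBLE_DEVICES are assumptions to adapt to your setup):

```shell
# Run via ctr, pointing at the nvidia-container-runtime binary explicitly;
# the runtime injects the GPU based on NVIDIA_VISIBLE_DEVICES rather than --gpus.
sudo ctr -a /var/run/docker/containerd/containerd.sock run --rm \
    --runc-binary /usr/bin/nvidia-container-runtime \
    --env NVIDIA_VISIBLE_DEVICES=all \
    docker.io/nvidia/cuda:11.0-base gpu-test nvidia-smi
```

If this prints the GPU table but the k8s pod still sees a blank nvidia-smi, the problem is almost certainly in runtime selection on the CRI path rather than in the runtime itself.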
Hi @elezer, yes changing this line in /etc/containerd/config.toml
default_runtime_name = "runc"
to
default_runtime_name = "nvidia"
fixed the problem. I'd recommend having this change integrated into the containerd section of the nvidia-container-runtime documentation at
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/user-guide.html
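For readers who would rather not make nvidia the default for every container, the RuntimeClass route from the suggestions above looks roughly like this (a sketch; the pod name and image tag are illustrative, and the handler must match the runtime name under [plugins."io.containerd.grpc.v1.cri".containerd.runtimes] in config.toml):

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia          # referenced by pods via runtimeClassName
handler: nvidia         # must match the "nvidia" runtimes entry in config.toml
---
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test        # hypothetical pod name
spec:
  runtimeClassName: nvidia
  containers:
  - name: cuda
    image: nvidia/cuda:11.0-base
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
```

With this in place, only pods that set runtimeClassName: nvidia go through nvidia-container-runtime, and the default runc runtime stays untouched.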
Thanks for the suggestions, happy to close this issue!
Thanks @drtpotter. I have added a task to update the docs. Glad that we were able to resolve the issue for you.