linuxkit/kubernetes

all cadvisor metrics have id="/"

Closed · 6 comments

Description

cAdvisor metrics do not have the correct cgroup path.

Describe the results you received:

The metrics can be obtained with the following command:

curl -s -k https://localhost:6443/api/v1/nodes/linuxkit-025000000002/proxy/metrics/cadvisor --cert /etc/kubernetes/pki/apiserver-kubelet-client.crt --key /etc/kubernetes/pki/apiserver-kubelet-client.key

What you will see is, e.g.:

container_memory_max_usage_bytes{container_name="",id="/",image="",name="",namespace="",pod_name=""} 3.03910912e+08

Note id="/".

Describe the results you expected:

On an Ubuntu install, you will instead see more metrics, each with a distinct id, e.g.:

container_memory_max_usage_bytes{container_name="weave",id="/kubepods/burstable/pod7db61ed7-e655-11e7-a92e-065f2a149e22/1c7e2c87fbdbf35542a2e060147b245455c45ca3cada8c68a9d730a12551d46e",image="weaveworks/weave-kube@sha256:07a3d56b8592ea3e00ace6f2c3eb7e65f3cc4945188a9e2a884b8172e6a0007e",name="k8s_weave_weave-net-vlw97_kube-system_7db61ed7-e655-11e7-a92e-065f2a149e22_1",namespace="kube-system",pod_name="weave-net-vlw97"} 8.4353024e+07

Additional information you deem important (e.g. issue happens only occasionally):

Kubernetes version 1.9.0 was used in both cases. The Ubuntu-based cluster was installed using weaveworks/kubernetes-ami#15.

See details here: https://gist.github.com/errordeveloper/2847ea94df2b2b0cccb60f0a6aa2b20f.

This probably has to do with kubelet running in a container, with cAdvisor somehow getting confused...
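
One way to check that hypothesis is to compare the cgroup hierarchy kubelet sees with the one on the host. A rough sketch, assuming the usual services.linuxkit containerd namespace and a service named kubelet (both are assumptions about the image layout):

# on the host: list the full memory cgroup hierarchy
ls /sys/fs/cgroup/memory/

# inside the kubelet service (namespace and service name assumed)
ctr -n services.linuxkit tasks exec --exec-id cgcheck kubelet ls /sys/fs/cgroup/memory/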

w9n commented

I assume this is only a Docker problem, as handled in #23 and #11. Did you check cri-containerd?

@w9n yes, I actually noticed it with cri-containerd first.

ijc commented

For my own future reference, this new rune doesn't hardcode the hostname, which improves cut-and-paste-ability:

curl -s -k https://localhost:6443/api/v1/nodes/$(uname -n)/proxy/metrics/cadvisor --cert /etc/kubernetes/pki/apiserver-kubelet-client.crt --key /etc/kubernetes/pki/apiserver-kubelet-client.key | grep container_memory_max_usage_bytes

Output is:

container_memory_max_usage_bytes{container_name="",id="/",image="",name="",namespace="",pod_name=""} 5.14842624e+08

ijc commented

FWIW #46 makes no difference to this issue (I wasn't sure if I should expect it to).

ijc commented

I think this is because kubelet has its own /sys/fs/cgroup (equivalent to the host's /sys/fs/cgroup/*/podruntime/kubelet). I'm testing some changes which instead bind the root cgroup hierarchy into kubelet + cri|docker now and will PR.
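
In linuxkit yml terms, that change amounts to something like the following in the kubelet (and runtime) service definition; this is only a sketch with a placeholder image reference, not the actual PR:

services:
  - name: kubelet
    image: linuxkit/kubelet:<placeholder>
    binds:
      - /sys/fs/cgroup:/sys/fs/cgroup   # host's root cgroup hierarchy rather than only podruntime/kubelet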