linuxkit/kubernetes

Failed to get cgroup stats for docker and kubelet in kubelet container

w9n opened this issue · 5 comments

w9n commented

Description

This error is logged every 10 seconds.
Steps to reproduce the issue:
Run a Docker cluster and check /var/log/kubelet.err.log

Describe the results you received:
Many errors like these:

E1127      617 summary.go:92] Failed to get system container stats for "/kubelet": failed to get cgroup stats for "/kubelet": failed to get container info for "/kubelet": unknown container "/kubelet"
E1127      617 summary.go:92] Failed to get system container stats for "/docker": failed to get cgroup stats for "/docker": failed to get container info for "/docker": unknown container "/docker"
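To see which system containers are affected and how often, the error lines above can be tallied with a short script. This is only a sketch: the function name `count_unknown_containers` is made up for illustration, and the pattern is based solely on the two log lines quoted in this report.

```python
import re
from collections import Counter

# Pattern taken from the kubelet error lines quoted in this issue.
STATS_ERROR = re.compile(
    r'Failed to get system container stats for "(?P<name>[^"]+)"'
)


def count_unknown_containers(lines):
    """Count, per system container name, how many stats errors appear."""
    counts = Counter()
    for line in lines:
        match = STATS_ERROR.search(line)
        if match:
            counts[match.group("name")] += 1
    return counts


if __name__ == "__main__":
    # Log path as given in the steps to reproduce.
    with open("/var/log/kubelet.err.log") as log:
        for name, count in count_unknown_containers(log).most_common():
            print(name, count)
```

Running it on the node that shows the errors prints one line per container name with its error count.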

Describe the results you expected:
no errors

Additional information you deem important (e.g. issue happens only occasionally):
I also noticed unstable metrics in Grafana 1-2 weeks ago, but that needs further investigation.

ijc commented

@justincormack you were looking at cgroups in the context of kubelet recently, does this ring any bells?

Seems like kubelet is trying to introspect containers which it didn't create and has no business with, but perhaps it can be satisfied somehow?
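One way to explore this is to check whether cgroups with the names from the errors exist on the node at all. The sketch below assumes a cgroup v1 layout mounted at /sys/fs/cgroup with one directory per controller; the helper name `cgroup_present` and the choice of controllers are illustrative, not something kubelet itself uses.

```python
import os


def cgroup_present(name, root="/sys/fs/cgroup", controllers=("cpu", "memory")):
    """Report, per controller, whether a cgroup directory with this name exists.

    Assumes a cgroup v1 hierarchy: <root>/<controller>/<name>.
    """
    relative = name.lstrip("/")
    return {
        controller: os.path.isdir(os.path.join(root, controller, relative))
        for controller in controllers
    }


if __name__ == "__main__":
    # Names taken from the "unknown container" errors in this issue.
    for name in ("/kubelet", "/docker"):
        print(name, cgroup_present(name))
```

If every controller reports `False` for a name, no cgroup with that name exists in the checked hierarchies, which would be consistent with the "unknown container" message.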

justincormack commented

Yeah, I need to rebase my patches for cgroups against this new repo, which may help.

w9n commented

With #14 the docker container crashes with:

[WARN  tini (536)] Tini is not running as PID 1 and isn't registered as a child subreaper.
Zombie processes will not be re-parented to Tini, so zombie reaping won't work.
To fix the problem, use the -s option or set the environment variable TINI_SUBREAPER to register Tini as a child subreaper, or run Tini as PID 1.
time="2017-11-29T17:26:38.586725083Z" level=warning msg="could not change group /var/run/docker.sock to docker: group docker not found"
time="2017-11-29T17:26:38.589577752Z" level=info msg="libcontainerd: new containerd process, pid: 659"
time="2017-11-29T17:26:40.495580579Z" level=info msg="Graph migration to content-addressability took 0.00 seconds"
time="2017-11-29T17:26:40.496412329Z" level=info msg="Loading containers: start."
time="2017-11-29T17:26:40.502136245Z" level=warning msg="Running modprobe nf_nat failed with message: `ip: can't find device 'nf_nat'\nmodprobe: module nf_nat not found in modules.dep`, error: exit status 1"
time="2017-11-29T17:26:40.505753597Z" level=warning msg="Running modprobe xt_conntrack failed with message: `ip: can't find device 'xt_conntrack'\nmodprobe: module xt_conntrack not found in modules.dep`, error: exit status 1"
time="2017-11-29T17:26:40.618113153Z" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
time="2017-11-29T17:26:40.660023730Z" level=info msg="Loading containers: done."
time="2017-11-29T17:26:40.709343027Z" level=info msg="Docker daemon" commit=f4ffd25 graphdriver(s)=overlay2 version=17.10.0-ce
time="2017-11-29T17:26:40.709612072Z" level=info msg="Daemon has completed initialization"
time="2017-11-29T17:26:40.734692152Z" level=info msg="API listen on /var/run/docker.sock"

EDIT:
It's the same log output as normal, but ctr t ls returns
docker 534 STOPPED

I didn't see any error in stdout from containerd, but it may have been truncated in my console.

Also, the kubelet does not get created.

Those errors are all harmless, and it says docker is listening, so something weird is going on.

w9n commented

Fixed by #49.