ibuildthecloud/systemd-docker

Wrong cgroup name on CentOS 7

grossws opened this issue · 5 comments

Environment:

  • CentOS 7.0, systemd 208, docker 1.2.0/1.14

Test service unit file /etc/systemd/system/sd-test.service:

[Unit]
Description=test
After=docker.service
Requires=docker.service

[Service]
ExecStartPre=/usr/bin/docker run --rm -v /opt/bin:/opt/bin ibuildthecloud/systemd-docker
ExecStart=/opt/bin/systemd-docker run --rm --name %n busybox /bin/sh -c 'while true ; do date ; sleep 5 ; done'
Type=notify
NotifyAccess=all

[Install]
WantedBy=multi-user.target

If you try to start it, you end up with a stopped (but not deleted) Docker container. systemctl status sd-test shows something like this:

[root@test system]# systemctl status sd-test
sd-test.service - test
   Loaded: loaded (/etc/systemd/system/sd-test.service; disabled)
   Active: failed (Result: timeout) since Wed 2014-11-26 14:43:34 UTC; 5min ago
  Process: 11976 ExecStart=/opt/systemd-docker run --rm --name %n busybox /bin/sh -c while true ; do date ; sleep 5 ; done (code=exited, status=1/FAILURE)
 Main PID: 11976 (code=exited, status=1/FAILURE)

Nov 26 14:42:04 ac-test systemd-docker[11976]: 2014/11/26 14:42:04 Moving pid 12006 to /sys/fs/cgroup/systemd/system.slice/sd-test.service/cgroup.procs
Nov 26 14:42:04 ac-test systemd-docker[11976]: 2014/11/26 14:42:04 Moving pid 12006 to /sys/fs/cgroup/blkio/system.slice/sd-test.service/cgroup.procs
Nov 26 14:42:04 ac-test systemd-docker[11976]: 2014/11/26 14:42:04 Moving pid 12006 to /sys/fs/cgroup/freezer/cgroup.procs
Nov 26 14:42:04 ac-test systemd-docker[11976]: 2014/11/26 14:42:04 Moving pid 12006 to /sys/fs/cgroup/devices/system.slice/cgroup.procs
Nov 26 14:42:04 ac-test systemd-docker[11976]: 2014/11/26 14:42:04 Moving pid 12006 to /sys/fs/cgroup/memory/system.slice/cgroup.procs
Nov 26 14:42:04 ac-test systemd-docker[11976]: 2014/11/26 14:42:04 open /sys/fs/cgroup/cpuacct,cpu/system.slice/docker-688bc8168ef5fd406a91c7b2d5522eab3a1e969378857669ba8b50e6a2bb01d2.scope/cgroup.procs: no such file or directory
Nov 26 14:42:04 ac-test systemd[1]: sd-test.service: main process exited, code=exited, status=1/FAILURE
Nov 26 14:43:34 ac-test systemd[1]: sd-test.service stopping timed out. Killing.
Nov 26 14:43:34 ac-test systemd[1]: Failed to start test.
Nov 26 14:43:34 ac-test systemd[1]: Unit sd-test.service entered failed state.

This unit starts fine on:

  • ArchLinux with systemd 217, docker 1.3.1/1.15,
  • CoreOS 509.1.0, systemd 215, docker 1.3.2/1.15.

Also, ls /sys/fs/cgroup shows that there is no cpuacct,cpu cgroup (on any of CentOS, Arch and CoreOS), but cpu,cpuacct is present and contains the appropriate Docker container scope.

The test environment I used can be bootstrapped with:

vagrant box add chef/centos-7.0
vagrant init chef/centos-7.0
vagrant up
# wait until it has started
vagrant ssh

# in vbox
sudo yum install -y docker

I ran into this and fixed it (for now) by only using the systemd cgroup, i.e. calling systemd-docker with --cgroups name=systemd. This at least allows systemd to track the processes.

Sorry, this issue looks like an upstream bug.

I added some logging to getCgroupsForPid and saw that an incorrect cgroup name is read from /proc/<pid>/cgroup. If I start a container using docker, I see something like this:

[root@test system]# cat /proc/15600/cgroup 
10:hugetlb:/
9:perf_event:/
8:blkio:/system.slice/docker-0bd31f558c426d46f30778cefbc727282e4f257537b42b361ea5d03a44f4febe.scope
7:net_cls:/
6:freezer:/system.slice/docker-0bd31f558c426d46f30778cefbc727282e4f257537b42b361ea5d03a44f4febe.scope
5:devices:/system.slice/docker-0bd31f558c426d46f30778cefbc727282e4f257537b42b361ea5d03a44f4febe.scope
4:memory:/system.slice/docker-0bd31f558c426d46f30778cefbc727282e4f257537b42b361ea5d03a44f4febe.scope
3:cpuacct,cpu:/system.slice/docker-0bd31f558c426d46f30778cefbc727282e4f257537b42b361ea5d03a44f4febe.scope
2:cpuset:/
1:name=systemd:/system.slice/docker-0bd31f558c426d46f30778cefbc727282e4f257537b42b361ea5d03a44f4febe.scope

Available cgroups are:

[root@test system]# ls -l /sys/fs/cgroup/ 
total 0
drwxr-xr-x. 4 root root  0 Nov 26 17:10 blkio
lrwxrwxrwx. 1 root root 11 Nov 25 12:43 cpu -> cpu,cpuacct
lrwxrwxrwx. 1 root root 11 Nov 25 12:43 cpuacct -> cpu,cpuacct
drwxr-xr-x. 4 root root  0 Nov 26 17:10 cpu,cpuacct
drwxr-xr-x. 2 root root  0 Nov 25 12:43 cpuset
drwxr-xr-x. 3 root root  0 Nov 26 17:04 devices
drwxr-xr-x. 3 root root  0 Nov 25 12:43 freezer
drwxr-xr-x. 2 root root  0 Nov 25 12:43 hugetlb
drwxr-xr-x. 3 root root  0 Nov 26 17:10 memory
drwxr-xr-x. 2 root root  0 Nov 25 12:43 net_cls
drwxr-xr-x. 2 root root  0 Nov 25 12:43 perf_event
drwxr-xr-x. 4 root root  0 Nov 25 12:43 systemd
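
To make the mismatch between the two listings above concrete, here is a minimal Python sketch (not the systemd-docker code, which is written in Go). The helper name naive_mount_name and the MOUNTED set are illustrative assumptions; the data is copied from the output above.

```python
# Directory names copied from the `ls -l /sys/fs/cgroup/` listing above.
MOUNTED = {
    "blkio", "cpu", "cpuacct", "cpu,cpuacct", "cpuset", "devices",
    "freezer", "hugetlb", "memory", "net_cls", "perf_event", "systemd",
}

# One line copied from the /proc/15600/cgroup output above.
line = ("3:cpuacct,cpu:/system.slice/docker-"
        "0bd31f558c426d46f30778cefbc727282e4f257537b42b361ea5d03a44f4febe.scope")


def naive_mount_name(cgroup_line):
    """Hypothetical helper: use the controller field verbatim as a directory name."""
    _hier_id, controllers, _path = cgroup_line.split(":", 2)
    return controllers


name = naive_mount_name(line)
print(name, name in MOUNTED)                      # cpuacct,cpu False
print([c for c in name.split(",") if c in MOUNTED])  # ['cpuacct', 'cpu']
```

The combined name as read from /proc does not match any directory, while each individual controller name does.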

This is a bug in this repo. It is not a kernel, Docker, or systemd bug. The kernel makes no guarantee about the order of the controller names in a combined cgroup (cpuacct,cpu in /proc/<pid>/cgroup versus cpu,cpuacct under /sys/fs/cgroup). But /sys/fs/cgroup does contain a symlink for each individual controller. So this code should do what libcontainer does: split the name on commas and treat each of the resulting tokens as a separate cgroup.
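
A minimal Python sketch of that splitting approach follows. It is an illustration only, not the actual patch and not libcontainer's code; the function name procs_files, the shortened scope name docker-abc.scope, and the handling of the name= prefix (inferred from the systemd directory in the listing) are assumptions.

```python
import posixpath

CGROUP_ROOT = "/sys/fs/cgroup"


def procs_files(proc_cgroup_text, root=CGROUP_ROOT):
    """Map each individual controller to the cgroup.procs file of its cgroup.

    Input is the text of /proc/<pid>/cgroup, one "<id>:<controllers>:<path>"
    entry per line. Combined controllers are split on commas, so the order in
    which the kernel reports them no longer matters.
    """
    result = {}
    for line in proc_cgroup_text.splitlines():
        line = line.strip()
        if not line:
            continue
        _hier_id, controllers, path = line.split(":", 2)
        for controller in controllers.split(","):
            if not controller:
                continue  # skip entries with an empty controller field
            # Assumption: a named hierarchy such as "name=systemd" is mounted
            # under its bare name ("systemd"), as in the listing above.
            if controller.startswith("name="):
                controller = controller[len("name="):]
            result[controller] = posixpath.join(
                root, controller, path.lstrip("/"), "cgroup.procs"
            )
    return result


# Hypothetical, shortened scope name for readability.
sample = """\
3:cpuacct,cpu:/system.slice/docker-abc.scope
1:name=systemd:/system.slice/docker-abc.scope
"""

for controller, procs in sorted(procs_files(sample).items()):
    print(controller, procs)
```

Each path goes through the per-controller entry (cpu, cpuacct), which on the systems in this thread is a symlink to the combined cpu,cpuacct directory, so the lookup never depends on the combined name.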

@cyphar Thanks for your comment! I was able to fix this bug in a fork of systemd-docker, which I'll publish to GitHub when I have a minute.

@drawks I sent a PR to fix this back in 2016: #40.