
Error while removing dangling images

ap4y opened this issue · 4 comments

ap4y commented

I'm using a custom script to update containers; as a last step it cleans up dangling images (docker images -f 'dangling=true'). When systemd-docker is used in unit files, dangling images fail to be removed for several seconds after a unit restarts; after about 5s, removal works. Steps to reproduce:

  1. Pull new image;
  2. Restart unit to make it use new image;
  3. Try to clean dangling images (for example docker rmi $(docker images -q -f 'dangling=true')).

I think that:

  1. Ideally docker rmi should not fail on images returned by docker images -q -f 'dangling=true', or
  2. docker images -q -f 'dangling=true' should not return such images for containers used by systemd-docker.

I'd appreciate any suggestions on how to solve or debug this. Thanks.
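As a stopgap for the window in which removal fails, the cleanup step could be retried a few times. The following is only a sketch of that workaround, not a fix for the underlying problem: the helper names retry and clean_dangling are hypothetical, and the attempt count and delay are guesses based on the "about 5s" observed above.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between failed attempts. Returns 0 on the first success, 1 otherwise.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Hypothetical helper: remove dangling images, re-listing them on each call
# so that images that stopped being in use are picked up on a later attempt.
clean_dangling() {
    ids=$(docker images -q -f 'dangling=true')
    # Nothing to remove counts as success.
    if [ -z "$ids" ]; then
        return 0
    fi
    # Word splitting on $ids is intentional: one argument per image ID.
    docker rmi $ids
}
```

In the update script, the last step would then be something like retry 5 2 clean_dangling, which gives the old container up to roughly 8 seconds to go away before giving up.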

@ap4y Whatever container you are running, run systemd-cgls and check whether all of the container's processes are in the service's cgroup. With systemd-docker, processes can escape the cgroup on start if they are launched fast enough. This is a current bug in systemd-docker.
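Besides reading the systemd-cgls tree by eye, the same check can be scripted by looking up the container's main PID and reading its cgroup membership from /proc. This is a rough sketch under assumptions: the function names are hypothetical, it only checks the main process (not every child), and it assumes the unit name appears verbatim in the cgroup path.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the cgroup listing on stdin
# mentions the given unit name as a fixed string.
cgroup_mentions_unit() {
    grep -qF -- "$1"
}

# Hypothetical helper: succeeds if the container's main process is in
# the unit's cgroup. $1 is the container name, $2 the systemd unit name.
container_in_unit() {
    container=$1
    unit=$2
    # State.Pid is the PID of the container's main process on the host.
    pid=$(docker inspect -f '{{.State.Pid}}' "$container") || return 2
    cgroup_mentions_unit "$unit" < "/proc/$pid/cgroup"
}
```

For the unit in this thread, the call would look like container_in_unit carnival-web-api carnival-web-api@staging.service; a non-zero exit status suggests the process is not in the service's cgroup or the container is not running.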

ap4y commented

I checked systemd-cgls and don't see anything suspicious. The container I tested runs a forking Ruby application server (unicorn), and all of its subprocesses belong to the service cgroup.

  │ └─carnival-web-api@staging.service
  │   ├─26256 /opt/bin/systemd-docker run --rm --name carnival-web-api -v /var/log/carnival-web-api:/carnival-web-api/log --env-file /etc/environment -E staging
  │   ├─26271 unicorn_rails master -c config/unicorn.rb -E staging
  │   ├─26295 unicorn_rails worker[0] -c config/unicorn.rb -E staging
  │   ├─26298 unicorn_rails worker[1] -c config/unicorn.rb -E staging
  │   ├─26300 unicorn_rails worker[2] -c config/unicorn.rb -E staging
  │   ├─26304 unicorn_rails worker[3] -c config/unicorn.rb -E staging
  │   ├─26307 unicorn_rails worker[4] -c config/unicorn.rb -E staging
  │   ├─26310 unicorn_rails worker[5] -c config/unicorn.rb -E staging
  │   ├─26313 unicorn_rails worker[6] -c config/unicorn.rb -E staging
  │   ├─26316 unicorn_rails worker[7] -c config/unicorn.rb -E staging
  │   ├─26318 unicorn_rails worker[8] -c config/unicorn.rb -E staging
  │   ├─26322 unicorn_rails worker[9] -c config/unicorn.rb -E staging
  │   ├─26324 unicorn_rails worker[10] -c config/unicorn.rb -E staging
  │   └─26327 unicorn_rails worker[11] -c config/unicorn.rb -E staging

ap4y commented

I checked systemd-cgls and docker ps right before the issue happens, and the cgroup is empty:

  │ └─carnival-web-api@staging.service

Meanwhile, docker ps says the container is still running.

ap4y commented

Tested against the alpha version of CoreOS; it seems to work fine there. Thanks for the help.