kubernetes-retired/kubeadm-dind-cluster

Insecure registries not working with latest fixed 1.10

Philippe-Collignon opened this issue · 5 comments

It was working before. Now, with the latest update, startup just fails:

* + Setting up insecure-registries on kube-master 
Job for docker.service canceled. 

My insecure registries variable is:
DIND_INSECURE_REGISTRIES="[ \"localregistry:5000\", \"0.0.0.0/0\"]"

It fails at docker exec ${container_id} systemctl restart docker
in function dind::custom-docker-opts

On the node, /etc/docker/daemon.json is correct.

If I try to run systemctl restart docker manually, I get these logs:

# systemctl daemon-reload
# systemctl restart docker
Job for docker.service canceled.
# service docker status
WARNING: terminal is not fully functional
● docker.service - Docker Application Container Engine
   Loaded: loaded (/lib/systemd/system/docker.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: https://docs.docker.com

Dec 04 15:36:15 kube-master systemd[1]: Starting Docker Application Container Engine...
Dec 04 15:36:15 kube-master rundocker[60]: Trying to load overlay module (this may fail)
Dec 04 15:36:15 kube-master systemd[1]: Stopped Docker Application Container Engine.
Dec 04 15:37:52 kube-master systemd[1]: Starting Docker Application Container Engine...
Dec 04 15:37:52 kube-master rundocker[148]: Trying to load overlay module (this may fail)
Dec 04 15:37:52 kube-master rundocker[148]: /dev/sda1 /var/lib/kubelet/pods ext4 rw,relatime,errors=remount-ro,data=ordered 0 0
Dec 04 15:37:52 kube-master rundocker[148]: /dev/sda1 /var/log/pods ext4 rw,relatime,errors=remount-ro,data=ordered 0 0
Dec 04 15:37:52 kube-master systemd[1]: Stopped Docker Application Container Engine.

If I start the cluster without insecure-registry, then ssh to the nodes, set the insecure-registry config in daemon.json, and restart Docker manually, it works!

Same effect for 1.12 when using DIND_REGISTRY_MIRROR and supplying DIND_CA_CERT_URL.

To get around this, you can add || true to the end of line 2295 in dind-cluster-v1.10.sh:
docker exec ${container_id} systemctl restart docker || true
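
For anyone wondering why this helps: the sketch below assumes the dind-cluster script runs with errexit enabled, so a failing command stops the whole run. Here `false` is only a stand-in for the failing `docker exec ... systemctl restart docker` call, and the variable names are made up for illustration.

```shell
# `false` stands in for the failing restart command (assumption, not the real call).

# Without the guard, errexit aborts the script before the next step runs.
unguarded_status=0
unguarded_output=$(bash -c 'set -e; false; echo continued') || unguarded_status=$?

# With `|| true`, the failure is swallowed and the script carries on.
guarded_status=0
guarded_output=$(bash -c 'set -e; false || true; echo continued') || guarded_status=$?

echo "unguarded: status=$unguarded_status output='$unguarded_output'"
echo "guarded:   status=$guarded_status output='$guarded_output'"
```

Note that `|| true` also hides genuine restart failures, so it is a workaround rather than a fix.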

I think the issue is that when docker.service starts, it also tries to start containerd.service, and containerd.service fails with the following message:

Dec 07 20:48:06 kube-master containerd[114]: containerd: creating temp mount location: mkdir /var/lib/containerd: file exists
Dec 07 20:48:06 kube-master systemd[1]: containerd.service: Main process exited, code=exited, status=1/FAILURE
Dec 07 20:48:06 kube-master systemd[1]: containerd.service: Unit entered failed state.
Dec 07 20:48:06 kube-master systemd[1]: containerd.service: Failed with result 'exit-code'.
Dec 07 20:48:06 kube-master systemd[1]: Stopped Docker Application Container Engine.

This is due to the symlink created in the Dockerfile (ln -s /dind/containerd /var/lib/containerd): /dind/ is empty at this point, so the link points at a directory that does not exist yet.

Later on, /usr/local/bin/wrapkubeadm is executed in the container. This script calls mkdir -p /dind/containerd. With /dind/ no longer empty, docker.service and containerd.service start successfully.
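
The dangling-symlink behavior can be reproduced outside the cluster. This is a rough sketch in a scratch directory, assuming shell `mkdir -p` treats a dangling link the same way containerd's own directory creation does; the paths under `$root` are stand-ins for the container's real paths, not something taken from the image.

```shell
# Scratch directory standing in for the container's filesystem (hypothetical paths).
root=$(mktemp -d)
mkdir -p "$root/var/lib" "$root/dind"

# Mirror the Dockerfile's link while $root/dind is still empty.
ln -s "$root/dind/containerd" "$root/var/lib/containerd"

# The link itself exists but its target does not, so creating the directory fails.
if mkdir -p "$root/var/lib/containerd" 2>/dev/null; then
  before="created"
else
  before="failed"
fi

# Equivalent of the wrapkubeadm step: create the directory the link points at.
mkdir -p "$root/dind/containerd"

# The same call now succeeds because the link resolves to a real directory.
if mkdir -p "$root/var/lib/containerd" 2>/dev/null; then
  after="ok"
else
  after="failed"
fi

echo "before target exists: $before"
echo "after target exists:  $after"
rm -rf "$root"
```

If that assumption holds, creating /dind/containerd before the restart (rather than ignoring the failure) would address the cause instead of the symptom.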