projectcalico/canal

Second node using docker0 instead of canal

mellotron opened this issue · 3 comments

I've set up a new Kubernetes cluster with kubeadm on two bare-metal nodes.

The LAN that they sit on is 192.168.1.x/24.

I initialized the first node with:

$ sudo kubeadm init --pod-network-cidr 10.244.0.0/16

And set up Canal with the steps on:

And joined node2 to the cluster with kubeadm.

However, when I join the second node and schedule pods on it, the pods get attached to the docker0 interface, while node1 does the right thing with Canal:

NAMESPACE     NAME                                   READY     STATUS    RESTARTS   AGE       IP            NODE
kube-system   canal-6mtm6                            3/3       Running   0          6m        192.168.1.5   node1
kube-system   canal-n5db6                            3/3       Running   0          5m        192.168.1.6   node2
kube-system   etcd-node1                             1/1       Running   0          7m        192.168.1.5   node1
kube-system   heapster-1428305041-hvbp0              1/1       Running   0          3m        172.17.0.3    node2
kube-system   kube-apiserver-node1                   1/1       Running   0          7m        192.168.1.5   node1
kube-system   kube-controller-manager-node1          1/1       Running   0          7m        192.168.1.5   node1
kube-system   kube-dns-3913472980-c0cst              3/3       Running   0          7m        10.244.0.7    node1
kube-system   kube-proxy-7s3h5                       1/1       Running   0          5m        192.168.1.6   node2
kube-system   kube-proxy-ft2p6                       1/1       Running   0          7m        192.168.1.5   node1
kube-system   kube-scheduler-node1                   1/1       Running   0          7m        192.168.1.5   node1
kube-system   monitoring-grafana-3975459543-r69tt    1/1       Running   0          3m        172.17.0.2    node2
kube-system   monitoring-influxdb-3480804314-9708z   1/1       Running   0          3m        10.244.0.8    node1
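
For anyone hitting the same symptom, here is a rough way to pick the affected pods out of a listing like the one above. It is only a sketch: pods_on_docker_bridge is a made-up helper name, it assumes the eight-column layout shown here (IP in column 7, NODE in column 8), and it assumes the stray addresses come from Docker's default bridge range, 172.17.0.0/16.

```shell
# Hypothetical helper (not part of kubectl or Canal): print "NAME IP NODE" for
# rows whose IP column falls in Docker's default bridge range (172.17.0.0/16)
# instead of the pod CIDR. Reads a pod listing on stdin and skips the header.
# Assumes the column layout from the listing above: IP is $7, NODE is $8.
pods_on_docker_bridge() {
    awk 'NR > 1 && $7 ~ /^172\.17\./ { print $2, $7, $8 }'
}

# Usage (assumes kubectl is configured for the cluster):
#   kubectl get pods --all-namespaces -o wide | pods_on_docker_bridge
```

If every line it prints names the same node, that points at that node's kubelet rather than at the network manifest.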

On node1, I see:

3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether ... brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
5: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default 
    link/ether ... brd ff:ff:ff:ff:ff:ff
    inet 10.244.0.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::.../64 scope link 
       valid_lft forever preferred_lft forever
9: cali70daf1af12c@if3:
   ...

While on node2, there's:

3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether ... brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::.../64 scope link 
       valid_lft forever preferred_lft forever
4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default 
    link/ether ... brd ff:ff:ff:ff:ff:ff
    inet 10.244.1.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::.../64 scope link 
       valid_lft forever preferred_lft forever
5: cni0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether ... brd ff:ff:ff:ff:ff:ff
    inet 10.244.1.1/24 scope global cni0
       valid_lft forever preferred_lft forever
    inet6 fe80::.../64 scope link 
       valid_lft forever preferred_lft forever
41: veth4e88edb@if40: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default
    ...
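
The "master docker0" on that veth is the giveaway that the pod was plugged into the Docker bridge. A small sketch for listing such interfaces, assuming one-line-per-interface output as produced by `ip -o link show`; ifaces_on_bridge is a hypothetical helper name, not an iproute2 command:

```shell
# Hypothetical helper: from `ip -o link show` style output on stdin, print the
# names of interfaces whose master is the bridge given as $1.
ifaces_on_bridge() {
    awk -v br="$1" '{
        for (i = 1; i < NF; i++)
            if ($i == "master" && $(i + 1) == br) {
                name = $2
                sub(/:$/, "", name)    # drop the trailing colon
                sub(/@.*$/, "", name)  # drop the @ifN peer suffix
                print name
            }
    }'
}

# Usage on the node (assumes iproute2 is installed):
#   ip -o link show | ifaces_on_bridge docker0
```

iproute2 can also filter directly with `ip link show master docker0`; the helper is just easier to run against saved output.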

However, the config looks good on node2:

node2 $ cat /etc/cni/net.d/10-calico.conf 
{
    "name": "k8s-pod-network",
    "type": "calico",
    "log_level": "info",
    "datastore_type": "kubernetes",
    "hostname": "node2",
    "ipam": {
        "type": "host-local",
        "subnet": "usePodCidr"
    },
    "policy": {
        "type": "k8s",
        "k8s_auth_token": "..."
    },
    "kubernetes": {
        "k8s_api_root": "https://10.96.0.1:443",
        "kubeconfig": "/etc/cni/net.d/calico-kubeconfig"
    }
}

I tried removing the docker0 interface and redeploying a pod, and got:

Error syncing pod, skipping: failed to "CreatePodSandbox" for "monitoring-influxdb-3480804314-msh2t_kube-system(4e2d8a46-2fe7-11e7-882e-00232478ac64)" with CreatePodSandboxError: "CreatePodSandbox for pod \"monitoring-influxdb-3480804314-msh2t_kube-system(4e2d8a46-2fe7-11e7-882e-00232478ac64)\" failed: rpc error: code = 2 desc = failed to start sandbox container for pod \"monitoring-influxdb-3480804314-msh2t\": Error response from daemon: {\"message\":\"failed to create endpoint k8s_POD_monitoring-influxdb-3480804314-msh2t_kube-system_4e2d8a46-2fe7-11e7-882e-00232478ac64_8 on network bridge: adding interface vethca446a1 to bridge docker0 failed: could not find bridge docker0: route ip+net: no such network interface\"}"

@mellotron can you verify in the logs that the Pod monitoring-influxdb-3480804314-msh2t was launched using the Calico CNI plugin?

It is probably worth verifying that the kubelet on node2 is configured with --network-plugin=cni and not --network-plugin=kubenet.
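
One way to check that flag is to look at the arguments of the running kubelet. The snippet below is a sketch, not an official tool: network_plugin_of is a hypothetical helper, and it assumes the flag is passed in the --network-plugin=VALUE form.

```shell
# Hypothetical helper: print the value of --network-plugin from a kubelet
# command line passed as arguments, or "unset" if the flag is not present.
# Assumes the --flag=value form; a space-separated form is not handled.
network_plugin_of() {
    for arg in "$@"; do
        case "$arg" in
            --network-plugin=*)
                echo "${arg#--network-plugin=}"
                return 0
                ;;
        esac
    done
    echo "unset"
}

# Usage on node2 (assumes a Linux host with procps `ps` and a running kubelet;
# the command substitution is left unquoted on purpose so it splits into words):
#   network_plugin_of $(ps -o args= -C kubelet)
```

Anything other than "cni" here would be consistent with pods being attached to a bridge instead of going through the Calico plugin.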

The Calico plugin should be used, and it should never try to connect a container to any bridge, so that's suspicious to me. Sounds like the kubelet might be trying to launch that Pod with another CNI plugin for some reason.

This was my fault. I had removed KUBELET_NETWORK_ARGS while working around another bug, and kubeadm reset doesn't remove the kubelet's systemd files, so kubeadm init doesn't write a fresh one over the modified one. Sorry about that.
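
In case it helps someone else, here is a quick sketch for checking whether a kubelet systemd drop-in still carries KUBELET_NETWORK_ARGS. check_network_args is a hypothetical helper, and the drop-in path in the usage line is an assumption about where kubeadm packages of that era put it; `systemctl cat kubelet` shows the files actually in use on a given node.

```shell
# Hypothetical helper: report whether the drop-in file given as $1 both
# defines KUBELET_NETWORK_ARGS (Environment= line) and passes it to the
# kubelet (ExecStart= line). A missing or unreadable file counts as "no".
check_network_args() {
    file=$1
    if grep -q '^Environment=.*KUBELET_NETWORK_ARGS=' "$file" 2>/dev/null; then
        defined=yes
    else
        defined=no
    fi
    if grep -q '^ExecStart=.*\$KUBELET_NETWORK_ARGS' "$file" 2>/dev/null; then
        used=yes
    else
        used=no
    fi
    echo "defined=$defined used=$used"
}

# Usage (the path is an assumption, not taken from this thread):
#   check_network_args /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
```

After restoring the variable, systemd needs `systemctl daemon-reload` followed by `systemctl restart kubelet` to pick up the change.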