cybozu-go/coil

Pods cannot communicate directly

tflabs-nl opened this issue · 16 comments

Describe the bug
After creating a Kubernetes cluster with the default service IPs and installing Coil as the CNI (no other CNIs), pods are not able to communicate with each other directly.

Environments

  • Kubernetes version:
    Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.3", GitCommit:"816c97ab8cff8a1c72eccca1026f7820e93e0d25", GitTreeState:"clean", BuildDate:"2022-01-25T21:25:17Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.3", GitCommit:"816c97ab8cff8a1c72eccca1026f7820e93e0d25", GitTreeState:"clean", BuildDate:"2022-01-25T21:19:12Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"linux/amd64"}
  • Kernel version: Linux HostnameHere 5.4.0-96-generic #109-Ubuntu SMP Wed Jan 12 16:49:16 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
  • OS: Ubuntu 20.04.3 LTS (focal)

To Reproduce
Steps to reproduce the behavior:

  1. Configure a default address pool:
     spec:
       blockSizeBits: 5
       subnets:
       - ipv4: 10.100.0.0/16
  2. Configure BIRD (setup BGP sessions and import/export from Coil routing table as described in the example)
  3. Apply the default nginx-ingress controller with some minor changes (type: LoadBalancer, loadBalancerIP: public_bird_ip, externalTrafficPolicy: Cluster; see the Service sketch after this list)
  4. Create a second address pool containing the exact same public_bird_ip
  5. Assign second address pool to namespace B
  6. Deploy Egress resource called "nat" in namespace B
  7. Create a Deployment / Service / Ingress in the default namespace and add the egress annotation for namespace B (egress.coil.cybozu.com/<namespace B>: nat) to the pod template(s)
  8. Now look at the IPs assigned to two different pods (both in the same namespace, though that doesn't seem to matter)
  9. curl localhost in both pods to verify that nginx inside the pod responds to requests (confirmed, works)
  10. curl the other pod's IP --> does not work --> NGINX returns 502 (Bad Gateway) errors and no traffic shows up in the access.log of the nginx inside the destination pod.

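For reference, a minimal sketch of the Service change from step 3. The controller name, namespace, and selector below are assumptions for illustration (not taken from the actual manifests), and 203.0.113.10 stands in for the public_bird_ip placeholder:

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller     # assumed name; adjust to your ingress controller
  namespace: ingress-nginx           # assumed namespace
spec:
  type: LoadBalancer
  loadBalancerIP: 203.0.113.10       # placeholder for the public IP advertised via BIRD
  externalTrafficPolicy: Cluster
  selector:
    app.kubernetes.io/name: ingress-nginx   # assumed selector
  ports:
  - name: http
    port: 80
    targetPort: 80
  - name: https
    port: 443
    targetPort: 443
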
Expected behavior
Pods should be able to access other pods in the same (or another) namespace.

Additional context
It doesn't matter whether both pods are scheduled on the same node; traceroute makes it look like traffic cannot be delivered to the destination pod.
Example traceroute (simplified):

  1. IP address of the node the source pod is running on
  2. IP address of the node the destination pod is running on

When curling the service's ClusterIP from the node itself, or even from another node, everything works as expected.

Is this a misconfiguration?

@ysksuzuki is there any way I can contact you so I can give you access to the cluster and the yamls to make debugging easier?

Hi, thank you for reporting the issue. Could you share the manifests you applied to your cluster?

Hi, thank you for your fast reply.

[ see my next reply for files ]

I only included the changes I made before building and the resulting yaml from the Coil build process.
I also included my BIRD config file so you can see the import/export rules.

If you need anything else please let me know!

Please write your files directly here.

Ahh, will do. I can't upload yaml files, so I will upload the zip instead here...

As an organizational policy, I am not allowed to open attachments, so please paste the yaml contents directly.

The generated coil.yaml is too big to paste, so I will skip that one.

default address pool:

apiVersion: coil.cybozu.com/v2
kind: AddressPool
metadata:
  name: default
spec:
  blockSizeBits: 5
  subnets:
  - ipv4: 10.100.0.0/16
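
(Aside: if I read Coil's AddressPool semantics correctly, blockSizeBits: 5 means each node is handed address blocks of 2^5 = 32 addresses carved out of 10.100.0.0/16.)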

kustomization used to generate coil yaml:

images:
- name: coil
  newTag: 2.0.14
  newName: ghcr.io/cybozu-go/coil

resources:
- config/default
# If you are using CKE (github.com/cybozu-go/cke) and want to use
# its webhook installation feature, comment the above line and
# uncomment the below line.
#- config/cke

# If you want to enable coil-router, uncomment the following line.
# Note that coil-router can work only for clusters where all the
# nodes are in a flat L2 network.
- config/pod/coil-router.yaml

# If your cluster has enabled PodSecurityPolicy, uncomment the
# following line.
#- config/default/pod_security_policy.yaml

patchesStrategicMerge:
# Uncomment the following if you want to run Coil with Calico network policy.
#- config/pod/compat_calico.yaml

# Edit netconf.json to customize CNI configurations
configMapGenerator:
- name: coil-config
  namespace: system
  files:
  - cni_netconf=./netconf.json

# Adds namespace to all resources.
namespace: kube-system

# Labels to add to all resources and selectors.
commonLabels:
  app.kubernetes.io/name: coil

netconf.json:

{
  "cniVersion": "0.4.0",
  "name": "k8s-pod-network",
  "plugins": [
    {
      "type": "coil",
      "socket": "/run/coild.sock"
    },
    {
      "type": "bandwidth",
      "capabilities": {
        "bandwidth": true
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    }
  ]
}

default egress:

apiVersion: coil.cybozu.com/v2
kind: Egress
metadata:
  namespace: default
  name: egress
spec:
  replicas: 1
  destinations:
  - 10.100.0.0/16

Create the webserver namespace with no annotations. Then:

Create a public-facing IP pool:

apiVersion: coil.cybozu.com/v2
kind: AddressPool
metadata:
  name: webserver
spec:
  blockSizeBits: 0
  subnets:
  - ipv4: 185.222.22.22/32
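
(Aside: with a /32 subnet and blockSizeBits: 0, blocks are 2^0 = 1 address each, so this pool holds exactly one usable address; in this setup that single address effectively ends up on the Egress NAT pod in the annotated namespace, if I understand the configuration correctly.)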

Create webserver-internet namespace with annotation for created IP pool:

apiVersion: v1
kind: Namespace
metadata:
  name: webserver-internet
  annotations:
    coil.cybozu.com/pool: webserver

Create webserver-internet egress:

apiVersion: coil.cybozu.com/v2
kind: Egress
metadata:
  namespace: webserver-internet
  name: nat
spec:
  replicas: 1
  destinations:
  - 0.0.0.0/0
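
For completeness, client pods opt in to this Egress with a pod-template annotation whose key is egress.coil.cybozu.com/<namespace of the Egress> and whose value is the Egress name, so in this case:

metadata:
  annotations:
    egress.coil.cybozu.com/webserver-internet: nat

(the same annotation used in the Deployment further below).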

Could you tell me what you want to do? Did you create a Pod with the egress.coil.cybozu.com/webserver-internet: nat annotation, and it couldn't access the internet?

I created a deployment with multiple replicas in the default namespace. I expected these pods to be able to curl/ping each other, but that doesn't seem to work. I ran apt update && apt install apache2 iputils-ping inside the pods to test the curling.

Pod 1 got the IP address 10.100.6.20
Pod 2 got the IP address 10.100.6.2

Both run on the same node, in the same namespace.

So inter-pod communication does not seem to work, while I expected it to do so.

yaml here:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ubuntu-debug-21-10
spec:
  selector:
    matchLabels:
      management: management
  replicas: 3
  strategy:
    type: RollingUpdate
  template:
    metadata:
      annotations:
        egress.coil.cybozu.com/webserver-internet: nat   
        egress.coil.cybozu.com/default: egress   
      labels:
        management: management
    spec:
      containers:
        - name: debugging
          image: 'weibeld/ubuntu-networking' #ubuntu:21.10
          command: [ "/bin/bash", "-c", "--" ]
          args: ["while true; do sleep 30; done;"]    
      dnsPolicy: None
      dnsConfig:
        nameservers:
          - 1.1.1.1
          - 8.8.8.8      

Can those Pods communicate with each other without the egress.coil.cybozu.com/webserver-internet: nat and egress.coil.cybozu.com/default: egress annotations? Why is the Egress in the default namespace needed?

yes, that works!

I thought the egress in the default namespace was needed to make sure 10.100.0.0/16 is not routed outside of the cluster, as the pods would otherwise only have a 0.0.0.0/0 route via webserver-internet: nat?

Only including the egress.coil.cybozu.com/webserver-internet: nat annotation also works.

I thought the egress in the default namespace was needed to make sure 10.100.0.0/16 is not routed outside of the cluster, as the pods would otherwise only have a 0.0.0.0/0 route via webserver-internet: nat?

Do you mean that you created the Egress in the default namespace to prevent packets destined for 10.100.0.0/16 from being routed outside of the cluster? If so, you don't need to do that. Coil allocates address blocks from the 10.100.0.0/16 address pool and publishes a routing entry for each block to every cluster node, so the cluster nodes are aware of the Pod CIDR.
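
Presumably that also explains the original symptom: because the default-namespace Egress listed destinations: 10.100.0.0/16, pods annotated with egress.coil.cybozu.com/default: egress sent their pod-to-pod traffic through the NAT pod rather than over the routes Coil publishes, which matches the observation that direct communication works once that annotation is dropped.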

I was indeed trying to avoid internal packets being forced over the internet. This makes sense! Thank you!

Also, do you have a donation link?