aws/amazon-vpc-cni-k8s

What is the purpose of the nodeagent sidecar if the network policy controller is disabled?

yurrriq opened this issue · 6 comments

What happened:

Given the following config...

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: amazon-vpc-cni
  namespace: kube-system
data:
  enable-network-policy-controller: "true"

... the sidecar is still installed.

---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: aws-node
  namespace: kube-system
spec:
  template:
    spec:
      containers:
        - args:
            - --enable-network-policy=false
          name: aws-eks-nodeagent

Environment:

  • Kubernetes version (use kubectl version): v1.24.16-eks-2d98532
  • CNI Version: v1.14.1-eksbuild.1
  • OS (e.g: cat /etc/os-release): Ubuntu 20.04.6 LTS
  • Kernel (e.g. uname -a): 5.15.0-1044-aws

@yurrriq there is no easy way to define the daemonset such that the node agent container can be enabled with a single modification. Excluding the container definition from the daemonset and having it only be rendered when a helm chart value is set makes it difficult for everyone that self-manages the addon and relies on the manifest.

The goal is for the node agent container is to consume effectively zero resources (CPU and memory) when enable-network-policy is false, so what I see as the real problem here is the container doing "too much" when network policy is disabled.

One option is for that container to exit when enable-network-policy is false, and another is for its runtime to be more efficient, like not serving metrics.

Thanks for the response. Might it make sense to provide different static manifests, maybe one with the nodeagent and one without? I recognize that's a slippery slope and potentially nontrivial maintenance burden... I have no evidence (either way, yet) that the sidecar with --enable-network-policy=false is doing "too much". Assuming it doesn't (will check shortly), I'm not too bothered. If we JSONPatch to remove it ourselves will the CNI get angry?

Update: I can confirm barely detectable CPU usage and flat ~15Mi memory, so negligible indeed.

We have considered multiple static manifests in the past, like for IPv6, Custom Networking, and IPv4 Prefix Delegation, but yeah it becomes a slippery maintenance slope and can add confusion. The CNI won't care if you patch out the node agent container, so that's a great option. As for the node agent process itself, yeah the only thing I see it doing that it probably shouldn't is serving metrics and health when it is disabled.

Makes sense. Thanks!

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

Excluding the container definition from the daemonset and having it only be rendered when a helm chart value is set makes it difficult for everyone that self-manages the addon and relies on the manifest.

@jdn5126 What would be the problem with a simple {{- if .Values.enableNetworkPolicy }}...{{- end }} and a default value of true?