canonical/microk8s

Upgrading calico to 3.26

Summary

A bug was found in Calico 3.25 and is believed to be fixed in 3.26.
Because MicroK8s 1.28, 1.29, and 1.30 still ship Calico 3.25, a user hit this bug after running MicroK8s 1.28 for 120 days.

What Should Happen Instead?

MicroK8s 1.28 should bundle Calico 3.26.
Also, as the Calico requirements pages suggest, MicroK8s 1.29 should bundle Calico 3.27 [1], and MicroK8s 1.30 should bundle Calico 3.28 [2].

[1] https://docs.tigera.io/calico/3.27/getting-started/kubernetes/requirements
[2] https://docs.tigera.io/calico/latest/getting-started/kubernetes/requirements

Reproduction Steps

  1. Deploy MicroK8s.
  2. Wait. The issue appears to occur once the token expires; a user reported it after the cluster had been running for 120 days.

Introspection Report

Can you suggest a fix?

Upgrade Calico to 3.26.5.

Are you interested in contributing with a fix?

Update the /var/snap/microk8s/current/args/cni-network/cni.yaml file: change the Calico image version from 3.25.1 to 3.26.5, and add a new ServiceAccount, ClusterRole, and ClusterRoleBinding:

---
# Source: calico/templates/calico-node.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: calico-cni-plugin
  namespace: kube-system
---
# CNI cluster role 
# Source: calico/templates/calico-node-rbac.yaml
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: calico-cni-plugin
rules:
  - apiGroups: [""]
    resources:
      - pods
      - nodes
      - namespaces
    verbs:
      - get
  - apiGroups: [""]
    resources:
      - pods/status
    verbs:
      - patch
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - blockaffinities
      - ipamblocks
      - ipamhandles
      - clusterinformations
      - ippools
      - ipreservations
      - ipamconfigs
    verbs:
      - get
      - list
      - create
      - update
      - delete
---
# Source: calico/templates/calico-node-rbac.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: calico-cni-plugin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: calico-cni-plugin
subjects:
- kind: ServiceAccount
  name: calico-cni-plugin
  namespace: kube-system

Update the existing calico-node ClusterRole:

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: calico-node
rules:
  # Used for creating service account tokens to be used by the CNI plugin
  - apiGroups: [""]
    resources:
      - serviceaccounts/token
    resourceNames:
      - calico-cni-plugin  # changed from calico-node
    verbs:
      - create

Apply the YAML file:
microk8s kubectl apply -f /var/snap/microk8s/current/args/cni-network/cni.yaml
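
The image-version change described above can also be scripted before the manifest is applied. The following is a minimal sketch, not an official MicroK8s procedure: the function name `bump_calico_images` is hypothetical, it assumes GNU sed (for `-i`), and it assumes the manifest references images in the form `calico/<component>:v<version>`. It only rewrites image tags; the ServiceAccount and RBAC changes listed above still have to be added by hand.

```shell
#!/bin/sh
# Hypothetical helper: rewrite Calico image tags in a manifest, keeping a backup.
# Assumes image references look like "calico/<component>:v3.25.1".
bump_calico_images() {
    manifest="$1"
    old="${2:-3.25.1}"   # version currently in the manifest
    new="${3:-3.26.5}"   # version to switch to

    # Keep a copy so the change can be reverted.
    cp "$manifest" "$manifest.bak" || return 1

    # Escape dots so they match literally in the sed pattern.
    old_re=$(printf '%s' "$old" | sed 's/\./\\./g')

    # Replace only the tag that follows a calico/<component>:v prefix.
    sed -i "s|\(calico/[a-z-]*:v\)$old_re|\1$new|g" "$manifest"
}

# Example (not executed here; likely needs root):
#   bump_calico_images /var/snap/microk8s/current/args/cni-network/cni.yaml
#   microk8s kubectl apply -f /var/snap/microk8s/current/args/cni-network/cni.yaml
```

Comparing the manifest against its `.bak` copy with `diff` before applying shows exactly which image lines changed.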

Per internal discussion, this change will land in version 1.32, as per PR #4638.

Is there a quick way to trigger this bug, so we can verify that upgrading Calico and updating the permissions actually resolves the issue?