Project target
The goal of this project is to check whether, and how, CNI plugins can be switched without downtime. Inspired by https://cilium.io/blog/2020/10/06/skybet-cilium-migration/
Sources
https://github.com/cilium/cilium/pull/14192/files - the Cilium switch for renaming other CNIs' configs
k8snetworkplumbingwg/multus-cni#560 - another important pointer
Materials:
Possible replacement for multus - cni-genie: https://www.linkedin.com/pulse/multi-cni-containers-network-interfaces-kubernetes-gokul-chandra https://github.com/cni-genie/CNI-Genie It does not appear to be actively maintained and does not support cilium.
Procedure
Initial setup
- create kind cluster:
./10_start_cluster.sh
- Install calico & goldpinger:
./20_initial_state.sh
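Before moving on to the migration it can help to confirm that the cluster is healthy. The following is a minimal sketch, assuming `kubectl` already points at the kind cluster; the function name `wait_for_cluster` and the default timeout are illustrative and not part of the project scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not one of the numbered project scripts.
# $1: timeout passed to kubectl wait (assumed default: 180s).
wait_for_cluster() {
  local timeout="${1:-180s}"
  # Block until every node reports the Ready condition.
  kubectl wait --for=condition=Ready nodes --all --timeout="$timeout" || return 1
  # List pods in all namespaces to spot anything that is not Running.
  kubectl get pods --all-namespaces -o wide
}
```

Source the snippet and run `wait_for_cluster` (or `wait_for_cluster 60s`) after `./20_initial_state.sh` has finished.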
Monitoring network
Each node exposes goldpinger through a NodePort service, on ports 30080, 31080, 32080 and 33080. To check it, open http://127.0.0.1:30080/
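The same ports can be polled from a shell, which is handy for watching connectivity while the migration scripts run. This is only a sketch: the helper names are hypothetical, and it assumes that goldpinger answers on its `/healthz` endpoint on each of the ports listed above.

```shell
#!/usr/bin/env bash
# Ports are taken from the text above; the helper names are hypothetical.
GOLDPINGER_PORTS="30080 31080 32080 33080"

# Print one health-check URL per exposed port.
goldpinger_urls() {
  local port
  for port in $GOLDPINGER_PORTS; do
    echo "http://127.0.0.1:${port}/healthz"
  done
}

# Probe every URL; return non-zero if at least one probe fails.
check_goldpinger() {
  local url failed=0
  for url in $(goldpinger_urls); do
    # -f: fail on HTTP errors, -s: silent, --max-time: do not hang on a dead node
    if curl -fs --max-time 5 -o /dev/null "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
      failed=1
    fi
  done
  return "$failed"
}
```

For example, `while sleep 2; do check_goldpinger; done` in a second terminal gives a rough view of whether any instance becomes unreachable during a step.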
Migration
- Install multus with calico:
./30_install_multus.sh
- Install cilium:
./40_install_cilium.sh
- Enable both CNIs:
./50_calico_default.sh
- if calico-node is restarted, it logs error messages such as:
Received route 10.245.2.0/24 with strange next-hop 10.245.2.28
- Switch to cilium as default:
./60_cilium_default.sh
- Switch to cilium (this step may not be needed):
./70_cilium.sh
- old pods can't communicate with new ones
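While stepping through the migration it is useful to see which CNI config files are present on each node and whether calico-node is logging the next-hop message quoted above. The helpers below are a sketch with hypothetical names. They assume kind's default Docker provider (each node is a container), a cluster called `kind` unless another name is passed, and CNI configs in `/etc/cni/net.d`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting the migration state.

# Print the CNI config files present on every kind node.
# $1: kind cluster name (assumed to be the kind default, "kind").
list_cni_configs() {
  local cluster="${1:-kind}" node
  for node in $(kind get nodes --name "$cluster"); do
    echo "== $node"
    docker exec "$node" ls -1 /etc/cni/net.d
  done
}

# Keep only the "strange next-hop" messages from log lines read on stdin.
filter_next_hop_errors() {
  grep -F "strange next-hop"
}
```

Run `list_cni_configs` after each step to compare the file lists. The log filter can be fed from calico-node, for example with `kubectl -n kube-system logs -l k8s-app=calico-node -c calico-node --tail=200 | filter_next_hop_errors`; the namespace and label used there are assumptions that depend on how calico was installed.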