Easily deploy a high-availability RKE2 Kubernetes cluster on OpenStack providers (e.g. Infomaniak, OVH, etc.). This project aims to offer a simple and stable distribution rather than supporting every possible configuration.
Inspired by and reworked from remche/terraform-openstack-rke2 to provide an easier interface, high availability, load balancing and sensible defaults for running production workloads.
- RKE2 Kubernetes distribution: lightweight, stable, simple and secure
- persisted /var/lib/rancher/rke2 for single-server durability
- configurable OpenStack Swift or S3-like backend for automated etcd snapshots
- smooth updates & automatic removal of agent nodes with pod draining
- bundled with OpenStack Cloud Controller and Cinder CSI
- Cilium networking with network policy support and no kube-proxy (see the check below)
- highly available through load balancers
- out-of-the-box support for volume snapshots and Velero
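Since Cilium replaces kube-proxy in this setup, a quick sanity check of the networking stack once the cluster is up can look like the sketch below (the kubeconfig file name comes from the quickstart; adjust it to your deployment):
# expect cilium pods in the output and no kube-proxy entries
kubectl --kubeconfig single-server.rke2.yaml -n kube-system get pods | grep -E 'cilium|kube-proxy'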
Component | Version |
---|---|
OpenStack | 2023.1 Antelope (verified); older versions may work too |
RKE2 | v1.26.6+rke2r1 |
OpenStack Cloud Controller | v1.27.1 |
OpenStack Cinder | v1.27.1 |
Velero | v2.32.6 |
git clone git@github.com:zifeo/terraform-openstack-rke2.git && cd terraform-openstack-rke2/examples/single-server
cat <<EOF > terraform.tfvars
project=PCP-XXXXXXXX
username=PCU-XXXXXXXX
password=XXXXXXXX
EOF
terraform init
terraform apply # approx 2-3 mins
kubectl --kubeconfig single-server.rke2.yaml get nodes
# NAME STATUS ROLES AGE VERSION
# k8s-pool-a-1 Ready <none> 119s v1.21.5+rke2r2
# k8s-server-1 Ready control-plane,etcd,master 2m22s v1.21.5+rke2r2
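To avoid passing --kubeconfig on every call, you can export it for the current shell (plain kubectl/kubeconfig usage, nothing specific to this module):
export KUBECONFIG=$PWD/single-server.rke2.yaml
kubectl get pods -A  # quick smoke test: pods should settle into Running/Completed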
# on upgrade, process node pool by node pool
terraform apply -target='module.rke2.module.servers["server-a"]'
# for servers, apply on the majority of nodes, then for the remaining ones
# this ensures the load balancer routes are updated as well
terraform apply -target='module.rke2.openstack_lb_members_v2.k8s'
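The exact -target addresses depend on the module and node pool names in your configuration; if in doubt, terraform state list (a standard Terraform command) shows the real addresses. The grep filter below is only illustrative:
terraform state list | grep -E 'servers|lb_members'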
See examples for more options or this article for a step-by-step tutorial.
Note: rsync and yq are required to generate the remote kubeconfig file. You can disable this behavior by setting ff_write_kubeconfig=false and fetch /etc/rancher/rke2/rke2.yaml from the server nodes yourself (see the sketch below).
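A minimal sketch of fetching it manually, assuming SSH access as the placeholder user ubuntu and a reachable SERVER_IP (both assumptions, adjust to your setup); the yq call rewrites the API endpoint from 127.0.0.1 to the public address:
ssh ubuntu@SERVER_IP 'sudo cat /etc/rancher/rke2/rke2.yaml' > rke2.yaml
yq -i '.clusters[0].cluster.server = "https://SERVER_IP:6443"' rke2.yaml  # yq v4 syntax
kubectl --kubeconfig rke2.yaml get nodes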
A stable, performant and fully equipped Kubernetes cluster in Switzerland for as little as CHF 26.90/month (at the time of writing):
- load balancer with floating IP (works well behind Cloudflare proxy)
- 1 server 2 CPU / 4 GB (= master)
- 1 agent 2 CPU / 4 GB (= worker)
Flavour | CHF/month |
---|---|
2×5.88 (instances) + 0.09×2×(4+6) (block storage) + 3.34 (IP) + 10 (load balancer) | 26.90 |
single 2 CPU/4 GB server with 1× 4 CPU/16 GB worker | ~37.— |
3× 2 CPU/4 GB HA servers with 1× 4 CPU/16 GB worker | ~50.— |
3× 2 CPU/4 GB HA servers with 3× 4 CPU/16 GB workers | ~85.— |
See the provider's technical documentation and pricing.
# aliases already set on the nodes
crictl
kubectl (server only)
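For example, to inspect containers directly through containerd (standard crictl commands):
sudo crictl ps                 # list running containers
sudo crictl logs CONTAINER_ID  # container logs, id taken from crictl ps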
# logs
sudo systemctl status rke2-server
journalctl -f -u rke2-server
sudo systemctl status rke2-agent.service
journalctl -f -u rke2-agent
less /var/lib/rancher/rke2/agent/logs/kubelet.log
less /var/lib/rancher/rke2/agent/containerd/containerd.log
less /var/log/cloud-init-output.log
# restore s3 snapshot (see restore_cmd output of the terraform module)
sudo systemctl stop rke2-server
sudo rke2 server --cluster-reset --etcd-s3 --etcd-s3-bucket=BUCKET_NAME --etcd-s3-access-key=ACCESS_KEY --etcd-s3-secret-key=SECRET_KEY --cluster-reset-restore-path=SNAPSHOT_PATH
sudo systemctl start rke2-server
# remove db on other server nodes
sudo systemctl stop rke2-server
sudo rm -rf /var/lib/rancher/rke2/server/db
sudo systemctl start rke2-server
# reboot all nodes one by one to make sure all is stable
sudo reboot
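Before and after a restore, it helps to confirm which snapshots actually exist; locally they live in the default RKE2 snapshot directory (S3 copies sit in the configured bucket):
sudo ls -lh /var/lib/rancher/rke2/server/db/snapshots/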
# check san
openssl s_client -connect 192.168.42.3:10250 </dev/null 2>/dev/null | openssl x509 -inform pem -text
# defrag etcd
kubectl -n kube-system exec $(kubectl -n kube-system get pod -l component=etcd --no-headers -o custom-columns=NAME:.metadata.name | head -1) -- sh -c "ETCDCTL_ENDPOINTS='https://127.0.0.1:2379' ETCDCTL_CACERT='/var/lib/rancher/rke2/server/tls/etcd/server-ca.crt' ETCDCTL_CERT='/var/lib/rancher/rke2/server/tls/etcd/server-client.crt' ETCDCTL_KEY='/var/lib/rancher/rke2/server/tls/etcd/server-client.key' ETCDCTL_API=3 etcdctl defrag --cluster"
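To check whether defragmentation is needed (or has worked), the same exec pattern can report database sizes via etcdctl endpoint status; this sketch reuses the certificate paths from the command above:
kubectl -n kube-system exec $(kubectl -n kube-system get pod -l component=etcd --no-headers -o custom-columns=NAME:.metadata.name | head -1) -- sh -c "ETCDCTL_ENDPOINTS='https://127.0.0.1:2379' ETCDCTL_CACERT='/var/lib/rancher/rke2/server/tls/etcd/server-ca.crt' ETCDCTL_CERT='/var/lib/rancher/rke2/server/tls/etcd/server-client.crt' ETCDCTL_KEY='/var/lib/rancher/rke2/server/tls/etcd/server-client.key' ETCDCTL_API=3 etcdctl endpoint status --cluster -w table"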
# increase volume size
# shutdown instance
# detach volume
# expand volume
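# the volume steps above map to the OpenStack CLI roughly as follows
# (INSTANCE_ID, VOLUME_ID and NEW_SIZE_GB are placeholders, adjust to your deployment)
openstack server stop INSTANCE_ID
openstack server remove volume INSTANCE_ID VOLUME_ID
openstack volume set --size NEW_SIZE_GB VOLUME_ID
openstack server add volume INSTANCE_ID VOLUME_ID
openstack server start INSTANCE_ID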
# recreate node
terraform apply -target='module.rke2.module.servers["server"]' -replace='module.rke2.module.servers["server"].openstack_compute_instance_v2.instance[0]'