An Ansible role that automates the initial deployment of K3s, the lightweight Kubernetes distribution. The goal is to have Ansible build just enough Kubernetes on a cluster node to get ArgoCD running. Ansible is then used to render the various application manifest files that ArgoCD will deploy.
Once the initial deployment is successful you do not need Ansible to maintain the cluster applications - ArgoCD will use an "App of Apps" pattern to handle this, along with Renovate to maintain and update application versions.
The following enhancements are part of this Ansible role:
- ZFS integration via a ZFS ZVOL for K3s. Unfortunately, K3s uses an integrated `containerd` with reduced functionality that lacks ZFS snapshot support.
- Instead, a ZFS ZVOL will be created with an XFS filesystem to allow K3s to use its native overlay snapshot filesystem.
- The ZFS native encryption can be enabled on the ZVOL.
- Non-root user account for Kubernetes, with passwordless access to `kubectl` by default.
- Centralized cluster system logging via rsyslog, with real-time viewing using the lnav utility.
- Helm Client for installing applications in Kubernetes.
- ArgoCD will deploy all applications used here. They are added to Git repository for ArgoCD and every few minutes it confirms that applications are deployed as configured.
- Non-compliant changes are automatically detected and optionally rolled back automatically.
- Renovate runs as a nightly (or on-demand) job, scanning the Git repository to detect whether application upgrades are available.
- If an upgrade is detected, Renovate will generate a Pull Request (PR) in the Git repository where you can review and approve the upgrade.
- This process updates the deployment manifest files which ArgoCD detects and will deploy the upgraded application for you.
- ArgoCD and Renovate work together to help keep your application versions current and prevent configuration drift.
- Cert-manager with Let's Encrypt wildcard certificates for your domains generated against Let's Encrypt staging or prod (Cloudflare DNS validator).
- K3s System Upgrade Controller to perform rolling upgrades to newer Kubernetes releases.
- Renovate will create the Pull Request for your review and approval.
- Once approved, within minutes the controller will start to upgrade the master nodes one by one and then the worker nodes.
Optionally Installed:
- kube-vip for Kubernetes API Load Balancer
- kube-vip-cloud-provider Load Balancer to replace K3s Klipper Load Balancer for ingress traffic.
- Sealed Secrets for true encrypted secrets safe for public git repositories (still recommend using private repository)
- Longhorn distributed Persistent Volumes as default storage class
- democratic-csi to provide Persistent Volumes storage via iSCSI and NFS from TrueNAS
- Kube-Prometheus Stack collection of Kubernetes manifests, Grafana dashboards, and Prometheus rules to provide easy to operate end-to-end Kubernetes cluster monitoring
- Traefik HA Load Balanced ingress deployed as a DaemonSet.
- IngressRoutes for the following will be generated and deployed:
- Traefik Dashboard
- ArgoCD Dashboard
- Longhorn Dashboard
- Prometheus Dashboard
- AlertManager Dashboard
- Grafana Dashboards
Home Cluster Compute Hardware Summary:
Device | Count | Cores / Threads | OS Disk Size | Data Disk Size | RAM | Purpose |
---|---|---|---|---|---|---|
HP T740 Thin PC | 3 | (Ryzen V1756B) 4 / 8 | 118GiB ZFS Mirror | 800GiB Rook-Ceph | 64GB | Kubernetes Master / Ceph Storage |
Minisforum UM560 | 2 | (Ryzen 5 5625U) 6 / 12 | 120GiB ZFS Mirror | 2 TiB Rook-Ceph | 64GB | Kubernetes Worker / Ceph Storage |
Custom Build (Fractal Design Node 804 with ASRock B660 Steel Legend) | 1 | (Intel i5-13500) 6 / 20 | 14TB ZFS RAIDZ | 2 TiB Rook-Ceph | 64GB | Kubernetes Worker / Ceph / Intel iGPU |
Custom Build (Fractal Design Node 804 with ASUS Prime X570-Pro) | 1 | (Ryzen 7 3700X) 8 / 16 | 1 TiB ZFS Mirror | 2 TiB Rook-Ceph | 128GB | Kubernetes Worker / Ceph / NVIDIA GPU / Desktop |
- All devices have at least 2.5GbE networking (some are 10GbE) to UniFi USW Enterprise 8 Port Switch with dual 10GbE uplinks to main network.
Home Cluster Network Summary:
Device | Count | Purpose | Specifications | Description |
---|---|---|---|---|
Firewall | 1 | Router | Intel i5-5200U CPU, 8GB RAM, ZFS mirror storage, 4x 1GbE RJ45 ports | Primary network pfSense firewall, router, DNS, Proxy |
Switch | 1 | Backbone | MikroTik CloudSwitch CRS309-1G-8S+IN. 8x SFP+ 10GbE ports | Primary Homelab Switch. All other switches and access points are downstream |
Switch | 1 | Cluster Switch | UniFi USW Enterprise 8 Port PoE Switch. 8x 2.5GbE RJ45 ports and 2x SFP+ 10GbE ports | Dedicated Kubernetes cluster switch with dual 10GbE uplinks to Backbone switch |
UPS | 3 | Backup Power Supply | Tripp Lite Smart 1500 LCDt UPS unit | Provide short term backup power and clean stable electricity to all devices |
KVM | 2 | Keyboard and Video Switch | 4 HDMI in, 4 USB In, 4 USB Out | Provides console access with keyboard to cluster devices |
- You should read it. :)
- A tweaked multi-node Kubernetes cluster based on K3s (no docker used)
- You will need to set up an Ansible inventory file in a defined way
- You will need to create a dedicated repository for ArgoCD, ideally a private GitHub repository (free)
- ArgoCD will require Ansible secrets set for repository URL, Access Token, etc.
- Ansible will render all initial application manifest files and commit them to Git repository
- ArgoCD will see remaining missing applications and deploy them as defined
- Renovate will monitor deployed application manifests and provide update notifications via Pull Request process
- Let's Encrypt configuration requires you to define your challenge credentials and list domains for certificate generation
- Kube-vip Load Balancer section will require you to specify a range of IP addresses available for use and a VIP address for the API Load Balancer
- Longhorn distributed storage (if enabled) is intended to be the default storage class; the `local-path` StorageClass is not installed
- Sealed Secrets can be used to provide truly encrypted secrets considered safe to be committed to public git repositories
- Ubuntu 22.04.x LTS
- Based on a ZFS on Root installation
- K3s v1.25.x - v1.27.x
- apparmor, apparmor-utils required for K3s Containerd to load profiles
- lnav for viewing centralized cluster system logs
- python3-pip (required for Ansible managed nodes)
- pip packages - OpenShift, pyyaml, kubernetes (required for Ansible to execute K8s module)
- k3s (Runs official script https://get.k3s.io)
- helm, helm diff, apt-transport-https (required for helm client install)
- open-iscsi, lsscsi, sg3-utils, multipath-tools, scsitools (required by democratic-csi and by Longhorn)
- xfsprogs (required for ZFS ZVOL used for K3s installation)
- I provide a lot of documentation notes below for my own use. If you find it overwhelming, keep in mind most of it you do not need.
- Towards the bottom is a section which shows how to use Ansible to run this in stages (step by step) to build it up in layers using `tags`.
- I no longer use Longhorn for in-cluster storage. I currently use Rook-Ceph instead; however, that's outside the scope of this project.
Each of these links provides useful documentation details:
- Review Linux OS Settings
- Review Centralized Cluster System Logs Settings
- Review K3S Configuration Settings
- Review ArgoCD Configuration Settings
- Review Renovate Configuration Settings
- Review Sealed Secrets Configuration Settings
- Review System Upgrade Controller Configuration Settings
- Review CertManager Configuration
- Review Let's Encrypt Configuration
- Review Kube-vip API Load Balancer Settings
- Review Traefik and Dashboard Settings
- Review Longhorn Distributed Storage Settings
- Review democratic-csi for TrueNAS Settings
- Review Kube Prometheus Stack Settings
Define a group for this playbook to use in your inventory, I like to use YAML format:

    k3s_control:
      hosts:
        k3s01.example.com: # Control Node / Master #1
          k3s_pool: "rpool"
          k3s_vol_size: "50G"
        k3s02.example.com: # Control Node / Master #2
          k3s_pool: "rpool"
          k3s_vol_size: "50G"
        k3s03.example.com: # Control Node / Master #3
          k3s_pool: "rpool"
          k3s_vol_size: "50G"

      vars: # Applies to all control nodes
        longhorn_zfs_pool: "tank"
        longhorn_vol_size: "10G"
        vip_endpoint_ip: "192.168.10.220"
        vip_lb_ip_range: "cidr-global: 192.168.10.221/30" # 4 Addresses pool
        traefik_lb_ip: "192.168.10.221" # must be within cidr ip_range
        k3s_labels:
          - "k3s-upgrade=true"

    k3s_workers:
      hosts:
        k3s04.example.com: # Worker #1
        k3s05.example.com: # Worker #2 (add more if needed)

      vars: # Applies to all worker nodes
        k3s_pool: "rpool"
        k3s_vol_size: "30G"
        k3s_labels:
          - "kubernetes.io/role=worker"
          - "node-type=worker"
          - "k3s-upgrade=true"

    k3s: # Group name for all nodes
      children:
        k3s_control:
        k3s_workers:

      vars:
        # Install versions are optional, lets you pin newer versions than defaults
        k3s_install_version: "v1.23.5+k3s1"
        argocd_install_version: "4.10.5"
        renovate_install_version: "32.152.0"
        cert_manager_install_version: "v1.8.2"
        sealed_secret_install_version: "v2.6.0"
        system_upgrade_controller_install_version: "v0.9.1"
        kube_vip_install_version: "v0.5.0"
        kube_vip_cloud_provider_install_version: "v0.0.3"
        traefik_install_version: "v10.22.0"
        longhorn_install_version: "v1.3.0"
        democratic_csi_install_version: "0.13.4"
        prometheus_op_install_version: "39.5.0"
        prometheus_op_crd_version: "v0.58.0"
        zfs_exporter_install_version: "v2.2.5"

        #[ Unique Per Cluster Settings ]############################################
        democratic_csi_parent_dataset: "main/k8s"
        k3s_cluster_ingress_name: "k3s-test.{{ansible_domain}}"
        argocd_repo_url: "https://github.com/<USERNAME>/<REPO-NAME>"

        # Longhorn does support S3 or NFS backup targets. Only NFS supported here.
        longhorn_backup_target: "nfs://192.168.10.102:/mnt/main/backups/longhorn-test"

        K3S_TOKEN: 'secret_here' # Set to any value you like

- This inventory file divides hosts into Control nodes and Worker nodes:
  - Easily defines a High Availability (HA) distributed etcd configuration.
  - The cluster will work fine with just a single node, but for HA you should have 3 (or even 5) control nodes:

    Master Nodes | Must Maintain | Can Lose | Comment |
    ---|---|---|---|
    1 | 1 | 0 | Loss of 1 is headless cluster |
    2 | 2 | 0 | Loss of 1 is headless cluster |
    3 | 2 | 1 | Allows loss of 1 master only |
    4 | 3 | 1 | No advantage over using 3 |
    5 | 3 | 2 | Allows loss of 2 masters |
    6 | 4 | 2 | No advantage over using 5 |
    7 | 4 | 3 | Allows loss of 3 masters |

  - K3s (through etcd) uses the Raft consensus algorithm to maintain quorum for HA.
  - More than 7 master nodes adds overhead for determining cluster membership and quorum, so it is not recommended. Depending on your needs, you typically end up with 3 or 5 master nodes for HA.
- For simplicity I show the variables within the inventory file. You can place these in respective group vars and host vars files.
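
The figures in the master-node table above follow from simple majority voting: a cluster of N voting members needs floor(N / 2) + 1 of them available, and it can lose the remainder. A minimal shell sketch of that arithmetic is shown here; the function names `quorum` and `fault_tolerance` are only illustrative and are not part of this role.

```shell
#!/bin/sh
# Members that must stay available for a cluster of $1 voting members.
quorum() {
    # Integer division gives floor(N / 2); a majority is one more than that.
    echo $(( $1 / 2 + 1 ))
}

# Members that can be lost while the cluster still holds quorum.
fault_tolerance() {
    echo $(( $1 - $(quorum "$1") ))
}

# Print one row per cluster size from 1 to 7.
for n in 1 2 3 4 5 6 7; do
    printf '%s masters: must maintain %s, can lose %s\n' \
        "$n" "$(quorum "$n")" "$(fault_tolerance "$n")"
done
```

Running it shows why even-sized clusters bring no benefit: 4 masters tolerate the same single loss as 3, and 6 tolerate the same two losses as 5.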
- `vip_endpoint_ip` specifies the IP address to be used for the Kubernetes API Load Balancer provided by Kube-vip.
- `vip_lb_ip_range` is a CIDR expression which defines the IP address range kube-vip can use to provide IP addresses for LoadBalancer services.
- `traefik_lb_ip` defines the IP address to be used for the Traefik ingress controller Load Balancer. It must be within the range defined by the `vip_lb_ip_range` CIDR.
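
To sanity-check that `traefik_lb_ip` falls inside `vip_lb_ip_range` before running the playbook, the CIDR arithmetic can be sketched in plain shell. This is only an illustration: the helper names `ip_to_int` and `ip_in_cidr` are hypothetical and not part of this role, the sketch handles IPv4 only, it assumes the shell uses 64-bit arithmetic, and it expects just the CIDR part of the value (without the `cidr-global:` prefix).

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address into a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 ))
}

# Exit status 0 when address $1 lies inside CIDR $2, non-zero otherwise.
ip_in_cidr() {
    ip=$(ip_to_int "$1")
    base=$(ip_to_int "${2%/*}")   # address part before the slash
    prefix=${2#*/}                # prefix length after the slash
    # Block size is 2^(32 - prefix) addresses.
    size=1
    i=$prefix
    while [ "$i" -lt 32 ]; do
        size=$(( size * 2 ))
        i=$(( i + 1 ))
    done
    # Align the given address down to the first address of its block.
    first=$(( base - base % size ))
    last=$(( first + size - 1 ))
    [ "$ip" -ge "$first" ] && [ "$ip" -le "$last" ]
}

# Values taken from the example inventory.
if ip_in_cidr "192.168.10.221" "192.168.10.221/30"; then
    echo "traefik_lb_ip is inside vip_lb_ip_range"
else
    echo "traefik_lb_ip is OUTSIDE vip_lb_ip_range"
fi
```

By standard CIDR arithmetic, the /30 block containing 192.168.10.221 spans 192.168.10.220 through 192.168.10.223, which matches the "4 Addresses pool" comment in the inventory.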
NOTE: After using Longhorn for a while, I have decided not to use it. It has issues reclaiming disk space and does not seem appropriate for any applications with heavy disk write activity. The amount of space is not a concern, just the amount of disk writes. Works great for low write volume applications.
- `longhorn_zfs_pool` lets you define the ZFS pool to create Longhorn cluster storage with. It will use the ZFS pool `rpool` if not defined. This can be host specific or group scoped.
- `longhorn_vol_size` specifies how much storage space you wish to dedicate to Longhorn distributed storage. This can be host specific or group scoped.
- `longhorn_backup_target` is the full NFS share URL path for Longhorn to make backups of the cluster storage volumes.
- `k3s_pool` lets you define the ZFS pool to be used for the K3s installation, mounted on `/var/lib/rancher`. You don't have to create a new pool, just specify a valid existing pool to use. A ZVOL will be created within the pool specified here.
- `k3s_vol_size` specifies the size of the ZVOL to create for the K3s installation. 30G to 50G is a reasonable starting size.
- `k3s_cluster_ingress_name` is the Fully Qualified Domain Name (FQDN) you plan to use for the cluster. This will point to the Traefik Ingress controller's Load Balancer IP address.
  - If not provided, it will default to `k3s` and the domain name of the Kubernetes Primary Master server, something like `k3s.localdomain` or `k3s.example.com`.
  - All of the respective dashboards (Traefik, Longhorn, Prometheus, Grafana, etc.) will be available from this FQDN.
- `k3s_cli_var` passes host specific variables to the K3s installation script.
- `k3s_labels` can be used to set labels on the cluster nodes. This can be host specific or group scoped. For example, instead of worker nodes having a default role of `<NONE>`, the following gives them a more Kubernetes-like role name:

      vars:
        k3s_labels:
          - "kubernetes.io/role=worker"
          - "node-type=worker"
          - "k3s-upgrade=true"

- `K3S_TOKEN` is a secret required for nodes to be able to join the cluster. The value of the secret can be anything you like. The variable needs to be scoped to the installation group.
  - While it can be defined directly within the inventory file or group_vars, it is better to create a variable named `K3S_TOKEN` using Ansible Vault.
  - If you do not define this variable, then the default `top_secret`, which is lame, will be used.
  - If you need inspiration for an easy way to create a secret value:

        $ date | md5sum
        0097661c0c55ccc8921617e0997d2e73

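
The `date | md5sum` hint above derives the value from the current time, which is guessable. As an alternative, here is a small sketch that reads 16 random bytes from `/dev/urandom` instead. The function name `generate_token` is hypothetical and not part of this role, and the sketch assumes a Linux host where `/dev/urandom` is available.

```shell
#!/bin/sh
# Print a random 32-character hexadecimal string (128 bits).
generate_token() {
    # od dumps 16 bytes as hex pairs; tr strips the spaces and newline.
    od -An -tx1 -N16 /dev/urandom | tr -d ' \n'
}

token=$(generate_token)
echo "K3S_TOKEN candidate: $token"
```

The resulting value can then be stored encrypted rather than in plain text, for example with `ansible-vault encrypt_string "$token" --name 'K3S_TOKEN'`, and the output pasted into your group vars.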
- `argocd_repo_url` is a URL which points to the Git repository (private recommended) that ArgoCD will monitor. Do NOT put `.git` at the end.
- `democratic_csi_parent_dataset` (if Democratic-CSI is to be used) specifies the TrueNAS parent dataset for iSCSI and NFS Persistent Volume storage. If multiple clusters are set up against the same TrueNAS server, then this value needs to be unique.
- `k3s_cluster_ingress_name` is the Fully Qualified Domain Name the cluster will be known as. This is a DNS name that Cert-Manager will create Let's Encrypt wildcard certificates for. Most dashboards will be based on this name.
The idea behind pinning specific versions of software is so that an installation done on Monday can be identical when installed on Tuesday or Friday, or sometime next month. Without pinning specific versions you have no way of knowing what random combination of versions you will get.
- `k3s_install_version` pins the K3s Release version.
- `argocd_install_version` pins the ArgoCD Helm Release (not application version).
- `renovate_install_version` pins the Renovate Helm Release (not application version).
- `cert_manager_install_version` pins the Cert-manager Helm Release.
- `sealed_secret_install_version` pins the Sealed Secrets Helm Release.
- `system_upgrade_controller_install_version` pins the application Release version.
- `kube_vip_install_version` pins the Application Container Tag Release.
- `kube_vip_cloud_provider_install_version` pins the Application Container Tag Release.
- `traefik_install_version` pins the Traefik Helm Release version.
- `longhorn_install_version` pins the Longhorn Helm Release version.
- `democratic_csi_install_version` pins the Democratic CSI iSCSI and/or NFS Provisioner Helm Release version.
- `prometheus_op_install_version` pins the Kube Prometheus Stack Helm Release version.
- `prometheus_op_crd_version` pins the Prometheus Operator CRD Release that Kube Prometheus Stack requires.
- `zfs_exporter_install_version` pins the Release of the ZFS Exporter used for ZFS filesystem monitoring.
Simple playbook I'm using for testing, named `k3s-argocd.yml`:

    - name: k3s Kubernetes Installation with ZFS & ArgoCD GitOPS
      hosts: k3s
      become: true
      gather_facts: true

      roles:
        - role: k3s-argocd

The most basic way to deploy K3s Kubernetes with ContainerD:
ansible-playbook -i inventory.yml k3s-argocd.yml
To limit execution to a single machine:
ansible-playbook -i inventory.yml k3s-argocd.yml -l k3s01.example.com
Instead of running the entire playbook, you can run smaller logical steps using tags. Or use a tag to re-run a specific step you are troubleshooting.
ansible-playbook -i inventory.yml k3s-argocd.yml -l k3s01.example.com --tags="<tag_goes_here>"
The following tags are supported and should be used in this order:
config_rsyslog
prep_os
install_k3s
apply_labels
validate_k3s
install_helm_client
install_sealed_secrets
install_argocd
deploy_apps
config_le_certificates
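
To walk through the tags above one at a time without retyping the command, a small wrapper loop can help. This is only a sketch: the `run_stages` function and the `RUN` environment variable are hypothetical conveniences, not something this role provides. By default it just prints the commands (a dry run); set `RUN=1` to actually execute them.

```shell
#!/bin/sh
# Tags in the order they are listed above.
TAGS="config_rsyslog prep_os install_k3s apply_labels validate_k3s
install_helm_client install_sealed_secrets install_argocd deploy_apps
config_le_certificates"

# Usage: run_stages <inventory> <playbook>
run_stages() {
    inventory=$1
    playbook=$2
    for tag in $TAGS; do
        if [ "${RUN:-0}" = "1" ]; then
            # Stop at the first failing stage so later stages do not run.
            ansible-playbook -i "$inventory" "$playbook" --tags="$tag" || return 1
        else
            echo "ansible-playbook -i $inventory $playbook --tags=$tag"
        fi
    done
}

run_stages inventory.yml k3s-argocd.yml
```

Stopping at the first failure keeps a broken stage from being buried under the output of later ones, which is the main reason to run in stages when troubleshooting.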
The following tags are not run by default but can be used to install this additional software:
install_democratic_csi_iscsi
install_democratic_csi_nfs
install_prometheus_operator
Other handy tags for specific routines:
- `update_kubeseal_cli` - will just update the `kubeseal` CLI to be version matched to the Sealed Secrets controller deployed.
- `update_argocd_cli` - will just update the `argocd` CLI to be version matched to the ArgoCD controller deployed.
- `update_argocd_files` - will process ArgoCD repository files like `install_argocd` does, but will not attempt to run Helm install on ArgoCD, which would likely error out as it should not be managed by Helm anymore.
- `update_zfs_exporter` - will just update the `zfs_exporter` utility deployed to nodes for ZFS filesystem monitoring to the version specified in the inventory variable.
A K3s cluster monitoring dashboard specific to this installation is available:
https://grafana.com/grafana/dashboards/16450
- This will be automatically installed as a configMap Dashboard for Grafana as part of the Kube-Prometheus-Stack procedure.
Additional Dashboards will also be deployed as ConfigMaps (modified from defaults to work with containerd and not docker):
- ArgoCD
- Cert-Manager
- Longhorn
- Traefik
- Several from dotdc modern dashboards
  - Global View
  - Namespaces
  - Nodes
  - Pods
  - API Server
  - CoreDNS
- Several from Kubernetes-Mixin (Deployed by Kube-Prometheus-Stack)
The K3s System Upgrade Controller is deployed to the `system-upgrade` namespace and `system-upgrade` ArgoCD project. It is used to perform rolling upgrades to newer Kubernetes releases when available. See Configuration Settings for more details.