📖 Overview
This is a mono repository for my home infrastructure and Kubernetes cluster. I try to adhere to Infrastructure as Code (IaC) and GitOps practices using tools like Ansible, Terraform, Kubernetes, Flux, Renovate and GitHub Actions.
⛵ Kubernetes
There's an excellent template over at k8s-at-home/template-cluster-k3s if you want to follow along with some of the practices I use here.
Installation
My cluster is k3s provisioned on top of bare-metal Ubuntu 20.04 using the Ansible galaxy role ansible-role-k3s. This is a semi hyper-converged cluster: workloads and block storage share the same available resources on my nodes, while a separate server provides (NFS) file storage.
Core Components
- projectcalico/calico: Internal Kubernetes networking plugin.
- rook/rook: Distributed block storage for persistent storage.
- mozilla/sops: Manages secrets for Kubernetes, Ansible and Terraform.
- kubernetes-sigs/external-dns: Automatically manages DNS records from my cluster in a cloud DNS provider.
- jetstack/cert-manager: Creates SSL certificates for services in my Kubernetes cluster.
- kubernetes/ingress-nginx: Ingress controller to expose HTTP traffic to pods over DNS.
GitOps
Flux watches my cluster folder (see Directories below) and makes the changes to my cluster based on the YAML manifests.
Renovate watches my entire repository looking for dependency updates; when they are found, a PR is automatically created. When a PR is merged, Flux applies the changes to my cluster.
Directories
This Git repository contains the following directories (kustomizations) under cluster.
📁 cluster # k8s cluster defined as code
├─📁 base # flux, gitops operator, loaded before everything
├─📁 crds # custom resources, loaded before 📁 core and 📁 apps
├─📁 charts # helm repos, loaded before 📁 core and 📁 apps
├─📁 config # cluster config, loaded before 📁 core and 📁 apps
├─📁 core # crucial apps, namespaced dir tree, loaded before 📁 apps
└─📁 apps # regular apps, namespaced dir tree, loaded last
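The load order described in the tree comments can be modelled as a small dependency graph. The following is a minimal sketch, not how Flux itself resolves ordering: the `LOAD_AFTER` mapping and the `load_order` helper are hypothetical names, and the assumption that crds, charts and config each depend only on base is read off the comments above. It needs Python 3.9 or newer for `graphlib`.

```python
from graphlib import TopologicalSorter

# Each directory maps to the set of directories that must be loaded before it.
# This mapping is an interpretation of the tree comments, not taken from the repo.
LOAD_AFTER = {
    "base": set(),
    "crds": {"base"},
    "charts": {"base"},
    "config": {"base"},
    "core": {"base", "crds", "charts", "config"},
    "apps": {"base", "crds", "charts", "config", "core"},
}


def load_order(deps):
    """Return one valid load order; raises graphlib.CycleError if deps has a cycle."""
    return list(TopologicalSorter(deps).static_order())


order = load_order(LOAD_AFTER)
print(order)
```

The relative order of crds, charts and config is not fixed by the comments, so any permutation of those three between base and core is a valid result.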
Networking
Name | CIDR |
---|---|
Kubernetes Nodes | 192.168.42.0/24 |
Kubernetes external services (Calico w/ BGP) | 192.168.69.0/24 |
Kubernetes pods | 10.69.0.0/16 |
Kubernetes services | 10.96.0.0/16 |
- HAProxy configured on Opnsense for the Kubernetes Control Plane Load Balancer.
- Calico configured with externalIPs to expose Kubernetes services with their own IP over BGP which is configured on my router.
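As a quick sanity check on the address plan, the four ranges in the table can be loaded into Python's standard `ipaddress` module to confirm that none of them overlap and to see which range a given address falls in. This is only an illustrative sketch; the `NETWORKS` labels and the helper names are made up here.

```python
import ipaddress
from itertools import combinations

# CIDRs copied from the table above; the short labels are illustrative.
NETWORKS = {
    "nodes": ipaddress.ip_network("192.168.42.0/24"),
    "external services": ipaddress.ip_network("192.168.69.0/24"),
    "pods": ipaddress.ip_network("10.69.0.0/16"),
    "services": ipaddress.ip_network("10.96.0.0/16"),
}


def overlapping_pairs(networks):
    """Return the label pairs whose address ranges overlap."""
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


def classify(ip, networks=NETWORKS):
    """Return the label of the range containing ip, or None if there is none."""
    addr = ipaddress.ip_address(ip)
    for name, net in networks.items():
        if addr in net:
            return name
    return None


print(overlapping_pairs(NETWORKS))
print(classify("192.168.69.10"))
```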
Persistent Volume Data Backup and Recovery
This is a hard topic to explain because there isn't a single great tool to work with rook-ceph. There's Velero, Benji, Gemini, and others, but they all have varying issues or nuances which make them unusable for me.
Currently I am leveraging Kasten K10 by Veeam, which does a good job of snapshotting Ceph block volumes and then exports the data in the snapshot to durable storage (S3 / NFS).
🌐 DNS
Ingress Controller
Over WAN, I have port forwarded ports 80 and 443 to the load balancer IP of my ingress controller that's running in my Kubernetes cluster.
Cloudflare works as a proxy to hide my home's WAN IP and also as a firewall. When not on my home network, all the traffic coming into my ingress controller on ports 80 and 443 comes from Cloudflare. In Opnsense I block all IPs not originating from Cloudflare's list of IP ranges.
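The allowlist rule amounts to a membership test of the source address against a set of CIDR ranges. The sketch below shows that logic only; it is not the Opnsense rule itself. The ranges in `ALLOWED_RANGES` are placeholder documentation blocks, not Cloudflare's real list, which Cloudflare publishes and which can change over time. `is_allowed` is a hypothetical helper name.

```python
import ipaddress

# Placeholder ranges (reserved documentation blocks), NOT Cloudflare's real list.
ALLOWED_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("198.51.100.0/24", "203.0.113.0/24")
]


def is_allowed(source_ip, allowed=ALLOWED_RANGES):
    """Return True if source_ip falls inside any allowed range."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in allowed)


print(is_allowed("203.0.113.7"))
print(is_allowed("192.0.2.1"))
```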
Internal DNS
k8s_gateway is deployed on Opnsense. With this setup, k8s_gateway has direct access to my cluster's ingress records and serves DNS for them in my internal network. k8s_gateway is only listening on 127.0.0.1 on port 53.
For adblocking, I have AdGuard Home also deployed on Opnsense, which has an upstream server pointing to the k8s_gateway I mentioned above. AdGuard Home listens on my MANAGEMENT, SERVER, IOT and GUEST networks on port 53. In my firewall rules I have NAT port redirection forcing all the networks to use the AdGuard Home DNS server.
Without much engineering of DNS @home, these options have made my Opnsense router a single point of failure for DNS. I believe this is OK though, because my router should have the most uptime of all my systems.
External DNS
external-dns is deployed in my cluster and configured to sync DNS records to Cloudflare. The only ingresses external-dns looks at to gather DNS records to put in Cloudflare are ones on which I explicitly set the annotation external-dns/is-public: "true".
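The selection rule can be pictured as a filter over ingress objects. This is a rough sketch of the idea rather than how external-dns implements it: the ingress dictionaries follow the usual Kubernetes manifest shape, the hostnames are made-up examples, and `public_hostnames` is a hypothetical helper.

```python
PUBLIC_ANNOTATION = "external-dns/is-public"


def public_hostnames(ingresses):
    """Collect hosts from ingresses explicitly annotated as public."""
    hosts = []
    for ingress in ingresses:
        annotations = ingress.get("metadata", {}).get("annotations") or {}
        # Annotation values are strings, so compare against "true", not True.
        if annotations.get(PUBLIC_ANNOTATION) != "true":
            continue
        for rule in ingress.get("spec", {}).get("rules") or []:
            host = rule.get("host")
            if host:
                hosts.append(host)
    return hosts


# Made-up example ingresses for illustration only.
example_ingresses = [
    {
        "metadata": {"name": "public-app", "annotations": {PUBLIC_ANNOTATION: "true"}},
        "spec": {"rules": [{"host": "app.domain.tld"}]},
    },
    {
        "metadata": {"name": "internal-app", "annotations": {}},
        "spec": {"rules": [{"host": "internal.domain.tld"}]},
    },
    {
        "metadata": {"name": "opted-out", "annotations": {PUBLIC_ANNOTATION: "false"}},
        "spec": {"rules": [{"host": "private.domain.tld"}]},
    },
]

print(public_hostnames(example_ingresses))
```

Ingresses without the annotation, or with any value other than the string "true", are skipped, so nothing becomes public by accident.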
Dynamic DNS
My home IP can change at any given time, so to keep my WAN IP address up to date on Cloudflare I have deployed a CronJob in my cluster that periodically checks and updates the A record ipv4.domain.tld.
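The check-and-update step such a job performs can be sketched as below. This is an assumption about the general shape of the logic, not the script the CronJob actually runs: `sync_record` and its three callables are hypothetical, and the lookups are injected as plain functions so the sketch runs without any network or Cloudflare API access. The addresses are documentation placeholders.

```python
import ipaddress


def sync_record(get_wan_ip, get_record_ip, update_record):
    """Update the DNS record only when the WAN IP has changed.

    Returns True if an update was made, False if the record was already current.
    """
    # Validate and normalise whatever the IP lookup returned.
    wan_ip = str(ipaddress.IPv4Address(get_wan_ip().strip()))
    if wan_ip == get_record_ip():
        return False
    update_record(wan_ip)
    return True


# In-memory stand-in for the A record, for illustration only.
record = {"ip": "203.0.113.10"}

first_run = sync_record(
    get_wan_ip=lambda: "203.0.113.25\n",
    get_record_ip=lambda: record["ip"],
    update_record=lambda ip: record.update(ip=ip),
)
second_run = sync_record(
    get_wan_ip=lambda: "203.0.113.25\n",
    get_record_ip=lambda: record["ip"],
    update_record=lambda ip: record.update(ip=ip),
)
print(first_run, second_run, record["ip"])
```

Comparing before writing keeps repeated runs from issuing an update when nothing has changed.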
🔧 Hardware
Device | Count | OS Disk Size | Data Disk Size | Ram | Operating System | Purpose |
---|---|---|---|---|---|---|
Protectli FW6D | 1 | 500GB mSATA | N/A | 16GB | Opnsense 22 | Router |
Intel NUC8i3BEK | 3 | 256GB NVMe | N/A | 32GB | Ubuntu 22.04 | Kubernetes (k3s) Masters |
Intel NUC8i5BEH | 3 | 240GB SSD | 1TB NVMe (rook-ceph) | 64GB | Ubuntu 22.04 | Kubernetes (k3s) Workers |
PowerEdge T340 | 1 | 2TB SSD | 8x12TB ZFS RAIDz2 | 64GB | Ubuntu 22.04 | Apps (Minio, Nexus, etc) & NFS |
Lenovo SA120 | 1 | N/A | 8x12TB | N/A | N/A | DAS |
Raspberry Pi | 1 | 32GB SD Card | N/A | 4GB | PiKVM | Network KVM |
TESmart 8 Port KVM Switch | 1 | N/A | N/A | N/A | N/A | Network KVM switch for PiKVM |
APC SMT1500RM2U w/ NIC | 1 | N/A | N/A | N/A | N/A | UPS |
CyberPower PDU41001 | 2 | N/A | N/A | N/A | N/A | PDU |
🤝 Gratitude and Thanks
Thanks to all the people who donate their time to the Kubernetes @Home community. A lot of inspiration for my cluster came from the people that have shared their clusters over at awesome-home-kubernetes.
📜 Changelog
See commit history
🔏 License
See LICENSE