/terraform-talos

Terraform examples to run Talos


Terraform examples to launch Talos.

This repository stores Terraform code for launching Talos on cloud platforms and bare metal. When I added features and cloud platform integrations to Talos, I had to run the tests manually, so I created this repository to make that easier.

There are no GitOps best practices here - no FluxCD, ArgoCD, or other GitOps tools. Each step is applied manually because I need to test everything to ensure it works as expected.

  • I chose not to use Terraform modules from the internet; the goal here is to build all cloud services from scratch.
  • I don’t maintain backward compatibility and always use the latest versions of Terraform and cloud provider tools.
  • Kubernetes isn’t fully ready for multi-cloud environments, as many components were designed for single-environment setups. So I made some changes to each cloud provider’s controllers (CCM, CSI, etc.) to improve compatibility.
  • The Talos CCM project was created to make multi-cloud setups more cloud-native, addressing some common issues in multi-cloud environments.

Some examples are production ready, and I’ve been using them with minor adjustments to fit my company’s needs. In most of my production setups, I use two or more cloud providers within a single Kubernetes cluster.

Everything here is under the MIT license. Feel free to clone or copy the code. If this project helps you, please give it a star. It helps me understand how many people are interested in this project and its ideas, and it motivates me to keep working on it and to add and sync new features.

Ideas

First, I will create separate clusters on each cloud provider, test them thoroughly, and bring them close to production readiness. Then I will merge these separate Kubernetes clusters into one cluster with a single control plane.

Why is it so important?

Having a single Kubernetes control plane that spans multiple cloud providers can offer several benefits:

  • Improved resilience and availability: By using multiple cloud providers, you can reduce the risk of downtime due to cloud provider outages or other issues.
  • Flexibility: A single control plane allows you to easily move workloads between different cloud providers, depending on your needs.
  • Cost savings: You can take advantage of the different pricing models and discounts offered by different cloud providers to save on costs.
  • Improved security: By using multiple cloud providers, you can implement a defense-in-depth strategy to protect your data and reduce the risk of a security breach.
  • Decrease the time to recovery (TTR)

Clouds

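| Platform  | Checked Talos version | Addons               | Setup type                    | Nat-IPv4 | IPv6 | Pod with global IPv6 |
|-----------|-----------------------|----------------------|-------------------------------|----------|------|----------------------|
| Azure     | 1.3.4                 | CCM, CSI, Autoscaler | many regions, many zones      |          |      |                      |
| Exoscale  | 1.3.0                 | CCM, Autoscaler      | many regions                  |          |      |                      |
| GCP       | 1.3.4                 | CCM, CSI, Autoscaler | one region, many zones        |          |      |                      |
| Hetzner   | 1.7.6                 | CCM, CSI, Autoscaler | many regions, one network zone |         |      |                      |
| Openstack | 1.3.4                 | CCM, CSI             | many regions, many zones      |          |      |                      |
| Oracle    | 1.3.4                 | CCM, CSI, Autoscaler | one region, many zones        |          |      |                      |
| Proxmox   | 1.8.2                 | CCM, CSI             | one region, many zones        |          |      |                      |
| Scaleway  | 1.7.6                 | CCM, CSI             | one region                    |          |      |                      |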

Known issues

  • Talos does not support the upstream Oracle CSI; use my fork instead.

Multi cloud compatibility

CCM controllers have different modes:

  • Talos CCM in mode: cloud-node
  • Other CCMs in mode: cloud-node-lifecycle
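
The split above can be sketched in Terraform as a map from each CCM to the controllers it runs. This is only an illustration, not code from this repository: the local names and map keys are hypothetical, and it assumes each CCM is built on the upstream Kubernetes cloud-provider framework and accepts its `--controllers` flag. In that framework, the `cloud-node` controller initializes new Node objects, and `cloud-node-lifecycle` removes Node objects whose instance no longer exists.

```hcl
# Sketch only: local names and map keys are hypothetical.
locals {
  # Controllers enabled per CCM, following the modes listed above.
  ccm_controllers = {
    talos    = ["cloud-node"]           # initializes new Node objects
    provider = ["cloud-node-lifecycle"] # removes Nodes whose instance is gone
  }

  # Render the value of the --controllers flag for each CCM
  # (assumes the CCM accepts the upstream cloud-provider flag).
  ccm_args = {
    for name, controllers in local.ccm_controllers :
    name => "--controllers=${join(",", controllers)}"
  }
}

output "ccm_args" {
  value = local.ccm_args
}
```

Running only one controller type per CCM keeps two managers from competing over the same Node fields, which is presumably why the modes are divided this way.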

CCM compatibility has been tested in multi-cloud setups, and in most cases, they work well together.

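|           | Azure | GCP | Hetzner | Openstack | Proxmox |
|-----------|-------|-----|---------|-----------|---------|
| Azure     |       |     |         |           |         |
| Exoscale  |       |     |         |           |         |
| GCP       |       |     |         |           |         |
| Hetzner   |       |     |         |           |         |
| Openstack |       |     |         |           |         |
| Proxmox   |       |     |         |           |         |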

Common

  • Cilium networking with VXLAN tunnels.
  • ingress-nginx (DaemonSet) runs on web role nodes. It uses hostNetwork ports 80 and 443 as an optimization: this lets me tune kernel settings on the host and have them apply to the ingress controller, and I can disable conntrack too.
  • coredns-local (DaemonSet) uses a dummy interface on all nodes with the IP 169.254.2.53. It reduces DNS response time because DNS traffic does not leave the node.
  • rancher.io/local-path as the default storage class.
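
As a rough sketch of the node side of coredns-local, the dummy interface could be expressed as a Talos machine config patch generated from Terraform. This is an assumption-heavy illustration, not the configuration used in this repository: it assumes the `dummy` and `addresses` fields under `machine.network.interfaces` and the `machine.kubelet.clusterDNS` field from the Talos v1.x machine configuration reference, the interface name `dummy0` and the fallback DNS address are hypothetical, and pointing the kubelet at the local address is my assumption about how pods would reach it.

```hcl
# Sketch only: names and the fallback DNS address are hypothetical.
locals {
  node_local_dns_ip = "169.254.2.53" # address from the list above
  cluster_dns_ip    = "10.96.0.10"   # assumption: cluster DNS service IP

  coredns_local_patch = {
    machine = {
      kubelet = {
        # Assumption: pods query the node-local address first.
        clusterDNS = [local.node_local_dns_ip, local.cluster_dns_ip]
      }
      network = {
        interfaces = [{
          interface = "dummy0" # hypothetical interface name
          dummy     = true
          addresses = ["${local.node_local_dns_ip}/32"]
        }]
      }
    }
  }
}

# YAML document that could be applied as a machine config patch.
output "coredns_local_patch" {
  value = yamlencode(local.coredns_local_patch)
}
```

Check the field names against the machine configuration reference for the Talos version you run before using a patch like this.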

You can find the common deployments in the _deployments folder.
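
For orientation, the Cilium and ingress-nginx choices described above roughly correspond to Helm values like the ones below, rendered here from Terraform so they can be passed to a manual `helm install -f`. This is a minimal sketch and not the contents of the _deployments folder: the value names assume the Cilium 1.14+ chart (older charts used a single `tunnel` value) and the ingress-nginx chart, and the node label is hypothetical.

```hcl
# Sketch only: not taken from the _deployments folder.
locals {
  # Cilium: tunnel routing over VXLAN.
  cilium_values = {
    routingMode    = "tunnel"
    tunnelProtocol = "vxlan"
  }

  # ingress-nginx: DaemonSet on the host network, pinned to web nodes.
  ingress_nginx_values = {
    controller = {
      kind        = "DaemonSet"
      hostNetwork = true # binds ports 80 and 443 directly on the node
      nodeSelector = {
        "node-role.kubernetes.io/web" = "" # hypothetical label for web role nodes
      }
    }
  }
}

output "cilium_values" {
  value = yamlencode(local.cilium_values)
}

output "ingress_nginx_values" {
  value = yamlencode(local.ingress_nginx_values)
}
```

Emitting the values as outputs keeps the apply step manual, in line with how the rest of this repository is used.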

References