An Ansible role to deploy the essentials of a highly available Kubernetes cluster, including in-cluster apiserver and ingress-controller load-balancing, DNS and flannel.
It was originally loosely based on the canonical kelseyhightower/kubernetes-the-hard-way runbooks, but completely Ansible-ised, and not GCP-specific.
- In common with kubernetes-the-hard-way, it downloads and runs the controller components (apiserver, controller-manager and scheduler) outside the cluster as systemd services; similarly for the kubelets on the worker nodes.
- However:
  - etcd runs in separate VMs.
  - The apiserver nodes are load-balanced using cloud-specific tools:
    - Libvirt / ESXi: keepalived (with IPVS real-servers on the same hosts as the directors). This means the host that owns the VIP receives the request, but hands it off in the kernel to be processed by one of the apiservers.
    - AWS: Network Load Balancers, configured for internal load-balancing; one per zone for resilience.
  - haproxy-ingress is used as the ingress controller. It runs as a daemonset on special `node-edge` worker nodes with hostNetwork.
It supports (at present) AWS, libvirt (KVM/QEMU) and ESXi infrastructure.
This project is designed to operate using clusterverse to provision and manage the base VM infrastructure. Please see the README.md there for instructions on deployment. There is an EXAMPLE folder that can be copied as a new project root.
Contributions are welcome and encouraged. Please see CONTRIBUTING.md for details.
It is only tested on Ubuntu 22.04 at present.
Libvirt (KVM/QEMU):
- It is non-trivial to set up username/password access to a remote libvirt host, so an SSH key is used instead.
- Your SSH user should be a member of the `libvirt` and `kvm` groups.
- Store the config in `cluster_vars.libvirt` (see the sketch below).
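A minimal, illustrative sketch of what `cluster_vars.libvirt` might contain is shown below; the key names (`host`, `username`, `private_key_file`) are assumptions for illustration only, not the role's documented schema. The authoritative settings are in clusterverse and the /EXAMPLE cluster_defs.

```yaml
# Illustrative only; key names are assumptions. See the /EXAMPLE cluster_defs for the real schema.
cluster_vars:
  libvirt:
    host: kvm01.example.com              # remote libvirt/KVM host (hypothetical)
    username: deploy                     # SSH user, member of the 'libvirt' and 'kvm' groups
    private_key_file: ~/.ssh/id_ed25519  # SSH key used instead of a password
```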
- Username & password for a privileged user on an ESXi host
- SSH must be enabled on the host
- Set the
Config.HostAgent.vmacore.soap.maxSessionCount
variable to 0 to allow many concurrent tests to run. - Set the
Security.SshSessionLimit
variable to max (100) to allow as many ssh sessions as possible. - You need a template VM. gold-img-build-esxi can be used if needed.
- DNS is optional. If set, you will need a DNS server of either nsupdate (bind9), AWS route53 or GCP CloudDNS.
- Store the config in
cluster_vars.esxi
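As with libvirt, the following is only an illustrative sketch of `cluster_vars.esxi`; the key names (`hostname`, `username`, `password`, `template`) are assumptions, not the role's documented schema.

```yaml
# Illustrative only; key names are assumptions. See the /EXAMPLE cluster_defs for the real schema.
cluster_vars:
  esxi:
    hostname: esxi01.example.com           # ESXi host with SSH enabled (hypothetical)
    username: root                         # privileged ESXi user
    password: "{{ vault_esxi_password }}"  # keep secrets in Ansible Vault
    template: gold-ubuntu2204-vm           # template VM, e.g. built with gold-img-build-esxi
```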
AWS:
- VPC and subnets configured.
- IAM role with access/secret key.
- Route53 private hosted zone (it could probably be run publicly, by setting `cluster_vars.assign_public_ip: true` and `cluster_vars.inventory_ip: public`, but this would be highly insecure).
- DNS is mandatory, because the NLBs do not provide a globally unique IP, and the apiservers need a load-balancer with either a single IP or a DNS name.
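For illustration only, an internal AWS cluster might set something like the following; the `inventory_ip` and `dns_server` key names are assumptions based on clusterverse conventions, so check the /EXAMPLE cluster_defs for the authoritative settings.

```yaml
# Illustrative only; verify key names against clusterverse and the /EXAMPLE cluster_defs.
cluster_vars:
  assign_public_ip: false    # keep the cluster internal (the secure option described above)
  inventory_ip: private      # assumption: register private IPs in the dynamic inventory
  dns_server: route53        # assumption: DNS provider selection (DNS is mandatory on AWS)
```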
Clusters are defined as code within Ansible YAML files that are imported at runtime. Because clusters are built from scratch on the localhost, the automatic Ansible `group_vars` inclusion cannot work with anything except the special `all.yml` group (actual groups need to be in the inventory, which cannot exist until the cluster is built). The `group_vars/all.yml` file is instead used to bootstrap merge_vars, and the definitions are hierarchically defined in `cluster_defs`. Please see the full documentation in the main clusterverse/README.md.
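As a rough illustration of that hierarchy (the directory and file names below are assumptions modelled on the clusterverse /EXAMPLE layout, not a definitive listing):

```
cluster_defs/
├── cluster_vars.yml              # settings common to every cloud
├── aws/
│   ├── cluster_vars.yml          # AWS-wide settings
│   └── <region>/
│       └── <buildenv>/
│           └── cluster_vars.yml  # per-region / per-buildenv overrides
└── libvirt/
    └── ...
```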
kubernetes-essentials-ansible is an Ansible role, and as such must be imported into your project's /roles directory. There is a full-featured example in the /EXAMPLE subdirectory.
To import the role into your project, create a `requirements.yml` file containing:

```yaml
roles:
  - name: clusterverse
    src: https://github.com/dseeley/clusterverse
    version: master        ## branch, hash, or tag
  - name: kubernetes
    src: https://github.com/dseeley/kubernetes-essentials-ansible
    version: master        ## branch, hash, or tag
```

- If you use a `cluster.yml` file similar to the example found in EXAMPLE/cluster.yml, clusterverse will be installed from Ansible Galaxy automatically on each run of the playbook.
- To install it manually:

```bash
ansible-galaxy install -r requirements.yml -p ./roles
```

For full clusterverse invocation examples and command-line arguments, please see the example README.md.
The role is designed to run in two modes:

Deploy (cluster.yml):
- A playbook based on the cluster.yml example will be needed.
- The `cluster.yml` playbook idempotently deploys a cluster from the config defined above (if it is run again with no changes to variables, it will do nothing). If the cluster variables are changed (e.g. a host is added), the cluster will reflect the new variables (e.g. a new host will be added to the cluster). Note: it will not remove nodes, nor, usually, will it reflect changes to disk volumes - these are limitations of the underlying cloud modules.
- Example (see the playbook sketch below):

```bash
ansible-playbook cluster.yml -e cloud_type=libvirt -e region=dougalab -e buildenv=dev -e testapps=true
```
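For orientation, a hypothetical, stripped-down cluster.yml might look like the sketch below. This is an assumption based on the role names in requirements.yml above, not a copy of EXAMPLE/cluster.yml, which remains the authoritative reference.

```yaml
# Hypothetical minimal cluster.yml; the real, full-featured playbook is EXAMPLE/cluster.yml.
- name: Provision and manage the base VM infrastructure with clusterverse
  hosts: localhost
  connection: local
  roles:
    - clusterverse

- name: Deploy the Kubernetes essentials onto the provisioned nodes
  hosts: all
  roles:
    - kubernetes
```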
Redeploy (redeploy.yml):
- A playbook based on the redeploy.yml example will be needed.
- The `redeploy.yml` playbook will completely redeploy the cluster; this is useful, for example, to upgrade the underlying operating system version.
- Please see the full documentation in the main clusterverse/README.md.
- Example:

```bash
ansible-playbook redeploy.yml -e canary=none -e cloud_type=esxifree -e clusterid=dougakube -e region=dougalab -e buildenv=dev -e testapps=true
```