Build the latest Kubernetes (K8S) non-HA cluster on AWS (CentOS or RHEL) using kubeadm to explore K8S. There are multiple K8S/AWS deployment tools (kops, Rancher, etc.); kubeadm is not yet a production-ready tool, but it is expected to become the standard one. Note that this setup is for exploring K8S, hence only basic security considerations are applied.
A simple 1 master + 2 workers cluster (the worker count can be increased by a parameter) in a VPC subnet, created by the Ansible playbooks.
The Ansible playbooks and inventories are laid out under the Git repository as follows.
.
├── cluster <---- K8S cluster installation home (AWS+K8S)
│ ├── ansible <---- Ansible playbook directory
│ │ ├── aws
│ │ │ ├── ec2
│ │ │ │ ├── creation <---- Module to setup AWS
│ │ │ │ └── operations
│ │ │ ├── conductor.sh
│ │ │ └── player.sh
│ │ └── k8s
│ │ ├── 01_prerequisite <---- Module to setup Ansible pre-requisites
│ │ ├── 02_os <---- Module to setup OS to install K8S
│ │ ├── 03_k8s_setup <---- Module to setup K8S cluster
│ │ ├── 04_k8s_configuration <---- Module to configure K8S after setup
│ │ ├── 10_datadog <---- Module to setup datadog monitoring (option)
│ │ ├── 20_applications <---- Module for sample applications
│ │ ├── conductor.sh <---- Script to conduct playbook executions
│ │ └── player.sh <---- Playbook player
│ ├── conf
│ │ └── ansible <---- Ansible configuration directory
│ │ ├── ansible.cfg <---- Configurations for all plays
│ │ ├── inventories <---- Each environment has its inventory here
│ │ ├── aws <---- AWS/K8S environment inventory
│ │ └── template
│ └── tools
├── master <---- K8S master node data for run_k8s.sh, created by run_aws.sh or updated manually.
├── run.sh <---- Run run_aws.sh and run_k8s.sh
├── run_aws.sh <---- Run AWS setups
└── run_k8s.sh <---- Run K8S setups
A module is a set of playbooks and roles that executes a specific task, e.g. 03_k8s_setup sets up a K8S cluster. Each module directory has the same structure: Readme, plays, and scripts.
03_k8s_setup/
├── Readme.md <---- description of the module
├── plays
│ ├── roles
│ │ ├── common <---- Common tasks both for master and workers
│ │ ├── master <---- Setup master node
│ │ ├── pki <---- Patch up K8S CA on master
│ │ ├── user <---- Setup K8S administrative users on master
│ │ ├── worker <---- Setup worker nodes
│ │ ├── helm <---- Setup Helm package manager
│ │ └── dashboard <---- Setup K8S Dashboard
│ ├── site.yml
│ ├── masters.yml <--- playbook for master node
│ └── workers.yml <--- playbook for worker nodes
└── scripts
└── main.sh <---- script to run the module (each module can run separately/repeatedly)
To be able to use realpath (on macOS):
brew install coreutils
Clone this repository.
Have an AWS access key ID, secret access key, and an AWS SSH keypair PEM file. MFA should not be used (or make sure to establish a session before execution).
Install the AWS CLI and set the environment variables below (see the example exports after the list).
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
- EC2_KEYPAIR_NAME
- REMOTE_USER <---- AWS EC2 user (centos for CentOS, ec2-user for RHEL)
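For example, the variables might be exported in the shell before running the scripts (the values below are placeholders):
export AWS_ACCESS_KEY_ID=<access key id>
export AWS_SECRET_ACCESS_KEY=<secret access key>
export EC2_KEYPAIR_NAME=<EC2 keypair name>
export REMOTE_USER=centos   # or ec2-user for RHEL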
Have Ansible (2.4.1 or later) and Boto installed to be able to use the Ansible AWS features. If the host is RHEL/CentOS/Ubuntu, running the script below will do the job.
(cd ./cluster/ansible/k8s/01_prerequisite/scripts && ./setup.sh)
Test the Ansible dynamic inventory script.
conf/ansible/inventories/aws/inventory/ec2.py
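A quick sanity check, assuming the script is executable and the AWS credentials are exported, is to call it directly; --list is the standard interface Ansible uses for dynamic inventory scripts:
cd conf/ansible/inventories/aws/inventory
./ec2.py --list   # should print a JSON inventory of EC2 instances (empty before any are created)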
Configure ssh-agent and/or .ssh/config with the AWS SSH PEM to be able to SSH into the targets without being prompted for a passphrase. Create a test EC2 instance and verify.
eval $(ssh-agent)
ssh-add <AWS SSH pem>
ssh ${REMOTE_USER}@<EC2 server> sudo ls # no prompt for asking password
Create a Datadog trial account and set the environment variable DATADOG_API_KEY to the Datadog account API key. The Datadog module sets up the monitors/metrics to verify that K8S is up and running, so monitoring and alerting can start right away.
Set the environment (or shell) variable TARGET_INVENTORY=aws. The variable identifies the Ansible inventory aws (same as ENV_ID in env.yml) to use.
Run ./run.sh to run everything at once (create the AWS IAM policy/role, VPC, subnet, router, SG, EC2, ..., then set up the K8S cluster and applications), or go through the configurations and executions step by step below.
Parameters for an environment are all isolated in the group_vars of the environment inventory. Go through the group_vars files to set the values.
.
├── conf
│ └── ansible
│ ├── ansible.cfg
│ └── inventories
│ └── aws
│ ├── group_vars
│ │ ├── all <---- Configure properties in the 'all' group vars
│ │ │ ├── env.yml <---- Environment parameters e.g. ENV_ID to identify and to tag configuration items
│ │ │ ├── server.yml <---- Server parameters e.g. location of kubelet configuration file
│ │ │ ├── aws.yml <---- e.g. AMI image id, volume type, etc
│ │ │ ├── helm.yml <---- Helm package manager specifics
│ │ │ ├── kube_state_metrics.yml
│ │ │ └── datadog.yml
│ │ ├── masters <---- For master group specifics
│ │ └── workers
│ └── inventory
│ ├── ec2.ini
│ ├── ec2.py
│ └── hosts <---- Target node(s) using tag values (set upon creating AWS env)
Set the AWS SSH keypair name in the EC2_KEYPAIR_NAME environment variable and in aws.yml.
Set the default Linux account (centos for CentOS EC2) that can sudo without a password as the Ansible remote_user to run the playbooks. If using another account, make sure it can sudo without a password and configure .ssh/config accordingly.
Set the inventory name aws to ENV_ID in env.yml, which is used to tag the configuration items in AWS (e.g. EC2). The tags are then used to identify configuration items that belong to the environment, e.g. EC2 dynamic inventory hosts.
Set the private AWS DNS name and IP of the master node instance. If run_aws.sh is used, it creates a file master which includes them, and run_k8s.sh uses them. Otherwise set them in env.yml and as environment variables (see the illustration after the list) after having created the AWS instances.
- K8S_MASTER_HOSTNAME
- K8S_MASTER_NODE_IP
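Purely as an illustration (the actual values depend on your VPC and the created master instance), the variables could look like:
export K8S_MASTER_HOSTNAME=ip-10-0-1-10.us-west-1.compute.internal   # illustrative private DNS name
export K8S_MASTER_NODE_IP=10.0.1.10                                  # illustrative private IP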
Set an account name to K8S_ADMIN in server.yml. The account is created by a playbook via LINUX_USERS in server.yml. Set an encrypted password in the corresponding field. Use mkpasswd as explained in the Ansible documentation.
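A minimal sketch of generating the hash, assuming the mkpasswd utility is available (it comes with the whois package on Debian/Ubuntu; see the Ansible FAQ for alternatives):
mkpasswd --method=sha-512   # paste the resulting hash into the password field of LINUX_USERS in server.yml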
Make sure the following environment variables are set:
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
- EC2_KEYPAIR_NAME
- REMOTE_USER
- DATADOG_API_KEY
Set the TARGET_INVENTORY=aws variable, which identifies the Ansible inventory aws (same as ENV_ID) to use. Then run run_aws.sh to create the AWS environment.
.
├── cluster
├── maintenance.sh
├── master
├── run.sh
├── run_aws.sh <--- Run this script.
└── run_k8s.sh
In the directory, run run_k8s.sh. If DATADOG_API_KEY is not set, the 10_datadog module will cause errors.
.
├── cluster
├── maintenance.sh
├── master <---- Make sure master node information is set in this file
├── run.sh
├── run_aws.sh
└── run_k8s.sh <---- Run this script
Alternatively, run each module one by one, and skip 10_datadog if not using Datadog (a sketch of looping over the modules follows the module list below).
pushd ansible/k8s/<module>/scripts && ./main.sh
or
ansible/k8s/<module>/scripts/main.sh aws <ansible remote_user>
Modules are:
├── 01_prerequisite <---- Module to setup Ansible pre-requisites
├── 02_os <---- Module to setup OS to install K8S
├── 03_k8s_setup <---- Module to setup K8S cluster
├── 04_k8s_configuration <---- Module to configure K8S after setup
├── 10_datadog <---- Module to setup datadog monitoring (option)
├── 20_applications <---- Module for sample applications
├── conductor.sh <---- Script to conduct playbook executions
└── player.sh <---- Playbook player
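A rough sketch of such a loop, run from the cluster directory (this loop is not part of the repository scripts; it assumes TARGET_INVENTORY and REMOTE_USER are set as described above):
for m in 01_prerequisite 02_os 03_k8s_setup 04_k8s_configuration 10_datadog 20_applications; do
  [ "$m" = "10_datadog" ] && [ -z "$DATADOG_API_KEY" ] && continue   # skip Datadog if no API key
  (cd "ansible/k8s/$m/scripts" && ./main.sh "$TARGET_INVENTORY" "$REMOTE_USER") || break   # each module can run separately/repeatedly
done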
The Ansible playbook of 20_applications shows the EXTERNAL-IP for the guestbook application.
"msg": [
"NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR",
"frontend LoadBalancer 10.104.46.88 aa8886b1f2f0f11e8a4ec06dfe7a500c-1694803115.us-west-1.elb.amazonaws.com 80:32110/TCP 43s app=guestbook,tier=frontend"
]
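The output above corresponds to a kubectl query such as the following (-o wide adds the SELECTOR column):
kubectl get service frontend -o wide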
Access http://EXTERNAL-IP and it should show the guestbook page.
Log in to Datadog and check its dashboard. The 10_datadog module has set up checks to verify the K8S API server, etcd, and kubelet. The Datadog agent pods report metrics from the kube-state-metrics pod in the cluster and from cAdvisor via kubelet, so cluster health and events such as a pod killed by OOM can be verified.
Make sure each node has the correct hostname set and that it can be resolved from all nodes. Otherwise a K8S node may fail to join the cluster even though kubeadm join reports success.
As of Kubernetes 1.8.0, kubelet will not work with swap enabled. You have two choices: either disable swap or add the kubelet flag --fail-swap-on=false to continue working with swap enabled.
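For example, to disable swap on each node (the fstab edit is an assumption about how the swap entry is written):
sudo swapoff -a                            # turn swap off for the running system
sudo sed -i '/ swap / s/^/#/' /etc/fstab   # comment out swap entries so it stays off after reboot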
Make sure to provide the Pod network CIDR range to kubeadm and that it aligns with the one specified in the Flannel manifest.
kubeadm init --pod-network-cidr=10.244.0.0/16
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "type": "flannel",
      "delegate": {
        "isDefaultGateway": true
      }
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",    <----
      "Backend": {
        "Type": "vxlan"
      }
    }
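The Flannel manifest itself is applied on the master after kubeadm init; assuming the upstream manifest location (the playbook may use a local copy instead), the step is roughly:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml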
Make sure to specify the correct IP. If using a hostname, make sure it will not be resolved to a NAT address in VM environments.
The cloud provider needs to be specified to kubeadm in the cluster configuration. If it is not specified, using the feature later requires re-installing the cluster (GitHub 57718).
kubeadm init --config kubeadm_config.yaml
Instead of passing --cloud-provider=aws to kubeadm, use the kubeadm configuration file. --cloud-provider=aws used to work, but there are several reports that it causes issues, and the kubeadm init documentation does not list it although the manifest section shows it.
kubeadm_config.yaml
kind: MasterConfiguration
apiVersion: kubeadm.k8s.io/v1alpha1
api:
  advertiseAddress: {{ APISERVER_ADVERTISE_ADDRESS }}
networking:
  podSubnet: {{ K8S_SERVICE_ADDRESSES }}    <---- POD network CIDR 10.244.0.0/16
cloudProvider: {{ K8S_CLOUD_PROVIDER }}    <---- aws
kubeadm reset does not clean up completely. You need to manually delete directories/files and Pod network interfaces; see the error "Failed to setup network for pod ... using network plugins "cni": no IP addresses available in network: podnet; Skipping pod".
rm -rf /var/lib/cni/
rm -rf /var/lib/kubelet/*
rm -rf /etc/cni/
ifconfig cni0 down
ifconfig flannel.1 down
ifconfig docker0 down
ip link delete cni0
ip link delete flannel.1
Make sure kubelet is in the same cgroups as Docker so that kubelet can talk with the Docker daemon.
kubelet --runtime-cgroups=<docker cgroup> --kubelet-cgroups <docker cgroup>
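To see which cgroup driver Docker uses before aligning kubelet, an illustrative check is:
docker info 2>/dev/null | grep -i 'cgroup driver'   # typically reports cgroupfs or systemd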
For K8S pods to be able to access host files, you need to align or relabel the SELinux file contexts, or configure Pod security contexts. To avoid these steps in this experimental K8S deployment, disable SELinux. DO NOT do this in real environments.
/etc/sysconfig/selinux
SELINUX=disabled
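For example, on the experimental nodes (again, not for real environments):
sudo setenforce 0                                                             # permissive for the current boot
sudo sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/sysconfig/selinux  # persist across reboots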
Turn off firewalld, as K8S uses iptables to re-route access to services to the backend pods.
sudo systemctl --now disable firewalld
sudo systemctl stop firewalld
- Using kubeadm to Create a Cluster
- Installing kubeadm
- Troubleshooting kubeadm
- GitHub Kubeadm Design Documents
- How kubeadm Initializes Your Kubernetes Master
Kubernetes has the concept of a Cloud Provider, which is a module that provides an interface for managing TCP load balancers, nodes (instances) and networking routes.