
The AWS EKS Accelerator for Terraform is a framework designed to help deploy and operate secure multi-account, multi-region AWS environments. The power of the solution is its configuration file, which lets users maintain a unique Terraform state for each cluster and manage multiple clusters from one repository. This code base also allows users to deploy EKS add-ons using Helm charts.


aws-eks-accelerator-for-terraform

Main Purpose

This project provides a framework for deploying best-practice, multi-tenant EKS clusters with Kubernetes add-ons, provisioned via HashiCorp Terraform and Helm charts on AWS.

Overview

The AWS EKS Accelerator for Terraform module helps you provision EKS clusters, managed node groups with On-Demand and Spot Instances, AWS Fargate profiles, and all the necessary Kubernetes add-ons for a production-ready EKS cluster. The Terraform Helm provider is used to deploy common Kubernetes add-ons with publicly available Helm charts. This module also provides integration with a number of AWS services, such as Amazon Managed Prometheus and EMR on EKS. This project leverages the community terraform-aws-eks modules to create the EKS cluster.

The intention of this framework is to help you design a config-driven solution. This allows you to create EKS clusters for various environments and AWS accounts across multiple regions, with a unique Terraform configuration and state file per EKS cluster.

  • main.tf - EKS Cluster and Amazon EKS Addon resources.

  • aws-eks-worker.tf - Amazon Managed node groups, self-managed nodes, and AWS EKS Fargate profile resources.

  • kubernetes-addons.tf - contains resources to deploy Kubernetes add-ons using the Helm and Kubernetes providers.

  • modules - folder containing the AWS resource sub-modules used by this module.

  • kubernetes-addons - folder containing Helm charts and Kubernetes resources for deploying Kubernetes add-ons.

  • deploy - folder containing examples to deploy an EKS cluster with multiple node groups and Kubernetes add-ons.

EKS Cluster Deployment Options

This module provisions the following EKS resources:

EKS Cluster resources

  1. EKS Cluster with multiple networking options
  2. Amazon EKS Addons
  3. Managed Node Groups with On-Demand - AWS Managed Node Groups with On-Demand Instances
  4. Managed Node Groups with Spot - AWS Managed Node Groups with Spot Instances
  5. AWS Fargate Profiles - AWS Fargate Profiles
  6. Self-managed Node Group with Windows support - Ability to create a self-managed node group for Linux or Windows workloads.
  7. Launch Templates - Launch templates available to Managed Node Groups and Self-managed Node Groups
  8. Bottlerocket OS - Managed and Self-managed Node Groups with Bottlerocket OS and Launch Templates
  9. Amazon Managed Service for Prometheus (AMP) - AMP makes it easy to monitor containerized applications at scale
  10. Amazon EMR on Amazon Elastic Kubernetes Service (EKS)

Kubernetes Addons

Kubernetes Addons deployed using Helm Charts and Kubernetes Resources

  1. Metrics Server
  2. Cluster Autoscaler
  3. AWS LB Ingress Controller
  4. Traefik Ingress Controller
  5. Nginx Ingress Controller
  6. FluentBit for Node Groups
  7. FluentBit for Fargate Containers
  8. Agones - Host, Run and Scale dedicated game servers on Kubernetes
  9. Prometheus
  10. Kube-state-metrics
  11. Alert-manager
  12. Prometheus-node-exporter
  13. Prometheus-pushgateway
  14. Cert Manager
  15. spark-k8s-operator

Node Group Modules

This module uses dedicated sub-modules for creating AWS Managed Node Groups, self-managed node groups, and Fargate profiles. These modules provide the flexibility to add or remove managed/self-managed node groups and Fargate profiles by simply adding or removing a map of values in the input config. See the examples below.

The aws-auth ConfigMap handled by this module allows your nodes to join your cluster, and you can also use this ConfigMap to grant RBAC access to IAM users and roles. Each node group can have a dedicated IAM role, launch template, and security group to improve security.

Please refer to this full example.
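
As an illustration of granting cluster access through aws-auth, the sketch below adds an IAM role and an IAM user to the ConfigMap. The variable names map_roles and map_users are assumptions borrowed from the community terraform-aws-eks module and may differ here, and the ARNs are placeholders; verify against this module's variables before use.

    # Sketch only: map an IAM role and an IAM user into the aws-auth ConfigMap.
    # map_roles / map_users are assumed variable names; the ARNs are placeholders.
    map_roles = [
      {
        rolearn  = "arn:aws:iam::123456789012:role/eks-admin-role"
        username = "eks-admin"
        groups   = ["system:masters"]
      }
    ]

    map_users = [
      {
        userarn  = "arn:aws:iam::123456789012:user/dev-user"
        username = "dev-user"
        groups   = ["system:masters"]
      }
    ]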

EKS Cluster Deployment Example

module "aws-eks-accelerator-for-terraform" {
  source = "git@github.com:aws-samples/aws-eks-accelerator-for-terraform.git"

  tenant            = "management"        # aws account alias
  environment       = "preprod"
  zone              = "dev"

  vpc_id             = "" # Enter VPC ID
  private_subnet_ids = [] # Enter Private Subnet IDs

  create_eks         = true
  kubernetes_version = "1.21"

  enable_managed_nodegroups = true
  managed_node_groups = {
    mg_4 = {
      node_group_name = "managed-ondemand"
      instance_types  = ["m4.large"]
      subnet_ids      = [] # Enter Private Subnet IDs
    }
  }

}

Managed Node Groups Example

    enable_managed_nodegroups = true

    managed_node_groups = {
      mg_m4 = {
        # 1> Node Group configuration
        node_group_name             = "managed-ondemand"
        create_launch_template      = true              # false will use the default launch template
        launch_template_os          = "amazonlinux2eks" # amazonlinux2eks or windows or bottlerocket
        public_ip                   = false             # Set to true to assign public IPs to EC2 instances; only valid when public subnets are used in the launch template
        pre_userdata           = <<-EOT
                yum install -y amazon-ssm-agent
                systemctl enable amazon-ssm-agent && systemctl start amazon-ssm-agent
            EOT
        # 2> Node Group scaling configuration
        desired_size    = 3
        max_size        = 3
        min_size        = 3
        max_unavailable = 1 # or percentage = 20

        # 3> Node Group compute configuration
        ami_type       = "AL2_x86_64"             # AL2_x86_64, AL2_x86_64_GPU, AL2_ARM_64, CUSTOM
        capacity_type  = "ON_DEMAND"              # ON_DEMAND or SPOT
        instance_types = ["m4.large"]             # List of instances used only for SPOT type
        disk_size      = 50

        # 4> Node Group network configuration
        subnet_ids  = []                          # Mandatory - Define private/public subnet IDs, e.g., ["subnet1","subnet2","subnet3"]
        k8s_taints = []
        k8s_labels = {
          Environment = "preprod"
          Zone        = "dev"
          WorkerType  = "ON_DEMAND"
        }
        additional_tags = {
          ExtraTag    = "m4-on-demand"
          Name        = "m4-on-demand"
          subnet_type = "private"
        }
        create_worker_security_group = true
      }
    }

Self-managed Node Groups Example

  enable_self_managed_nodegroups = true

  self_managed_node_groups = {
    self_mg_4 = {
      node_group_name    = "self-managed-ondemand"
      create_launch_template = true
      launch_template_os = "amazonlinux2eks"       # amazonlinux2eks  or bottlerocket or windows
      custom_ami_id      = "ami-0dfaa019a300f219c" # Bring your own custom AMI generated by Packer/ImageBuilder/Puppet etc.
      public_ip          = false                   # Enable only for public subnets
      pre_userdata       = <<-EOT
            yum install -y amazon-ssm-agent
            systemctl enable amazon-ssm-agent && systemctl start amazon-ssm-agent
        EOT

      disk_size     = 20
      instance_type = "m5.large"
      desired_size = 2
      max_size     = 10
      min_size     = 2
      capacity_type = "" # Optional; set to "spot" to use Spot capacity

      k8s_labels = {
        Environment = "preprod"
        Zone        = "test"
        WorkerType  = "SELF_MANAGED_ON_DEMAND"
      }

      additional_tags = {
        ExtraTag    = "m5x-on-demand"
        Name        = "m5x-on-demand"
        subnet_type = "private"
      }
      subnet_ids  = module.aws_vpc.private_subnets
      create_worker_security_group = false
    },

  }

Fargate Profile Example

    enable_fargate = true

    fargate_profiles = {
      default = {
        fargate_profile_name = "default"
        fargate_profile_namespaces = [{
          namespace = "default"
          k8s_labels = {
            Environment = "preprod"
            Zone        = "dev"
            env         = "fargate"
          }
        }]

        subnet_ids = [] # Mandatory - Define private subnet IDs, e.g., ["subnet1","subnet2","subnet3"]

        additional_tags = {
          ExtraTag = "Fargate"
        }
      }
    }

Kubernetes Addon Example

The following example deploys the Addons with default configuration.

metrics_server_enable = true            # Deploys Metrics Server Addon

cluster_autoscaler_enable = true        # Deploys Cluster Autoscaler Addon

prometheus_enable = true                # Deploys Prometheus Addon

This module also allows you to override values.yaml from the consumer module.

  # Optional Map value
  metrics_server_helm_chart = {
    name           = "metrics-server"
    repository     = "https://kubernetes-sigs.github.io/metrics-server/"
    chart          = "metrics-server"
    version        = "3.5.0"
    namespace      = "kube-system"
    timeout        = "1200"

    # (Optional) Example to pass metrics-server-values.yaml from your local repo
    values = [templatefile("${path.module}/k8s_addons/metrics-server-values.yaml", {
      operating_system                = "linux"
    })]
  }

Bottlerocket OS

Bottlerocket is an open-source operating system specifically designed for running containers. The Bottlerocket build system is based on Rust. It is a container host OS and doesn't include software or package managers beyond what is needed for running containers, which makes it very lightweight and secure. Container-optimized operating systems are ideal when you need to run applications in Kubernetes with minimal setup, do not want to worry about security or updates, or want OS support from the cloud provider. Container operating systems apply updates transactionally.

Bottlerocket runs two host containers: the control container, which is on by default and used for AWS Systems Manager and remote API access, and the admin container, which is off by default and used for deep debugging and exploration.

Bottlerocket launch template user data uses the TOML format with key-value pairs. Remote API access is provided via the SSM agent. You can enable the troubleshooting (admin) container via user data: [settings.host-containers.admin] enabled = true. See the sketch below.
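
As an illustration, the sketch below defines a managed node group running Bottlerocket via a launch template and passes the admin-container setting as TOML user data. Passing this TOML through pre_userdata is an assumption; verify how the node group sub-module handles Bottlerocket user data before relying on it.

    enable_managed_nodegroups = true
    managed_node_groups = {
      brkt_m5 = {
        node_group_name        = "managed-bottlerocket"
        create_launch_template = true
        launch_template_os     = "bottlerocket"   # Bottlerocket AMI via launch template
        instance_types         = ["m5.large"]
        subnet_ids             = []               # Enter Private Subnet IDs

        # Assumption: TOML settings are passed as user data to enable the admin container
        pre_userdata = <<-EOT
          [settings.host-containers.admin]
          enabled = true
        EOT
      }
    }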

Features

  • Secure - Opinionated, specialized, and highly secured
  • Flexible - Multi-cloud and multi-orchestrator
  • Transactional - Image-based upgrades and rollbacks
  • Isolated - Separate container runtimes

Updates

Bottlerocket can be updated automatically via a Kubernetes operator:

    kubectl apply -f Bottlerocket_k8s.csv.yaml
    kubectl get ClusterServiceVersion Bottlerocket_k8s | jq '.status'

How to Deploy

Prerequisites:

Ensure that you have installed the following tools on your Mac or Windows laptop before working with this module and running Terraform plan and apply:

  1. aws cli
  2. aws-iam-authenticator
  3. kubectl
  4. terraform

Deployment Steps

The following steps walk you through the deployment of the example DEV cluster configuration. This config deploys a private EKS cluster with public and private subnets. One managed node group and a Fargate profile for the default namespace are placed in private subnets. The ALB created by the AWS LB Ingress Controller is placed in public subnets. It also deploys a few Kubernetes apps, i.e., the AWS LB Ingress Controller, Metrics Server, and Cluster Autoscaler.

Step1: Clone the repo

git clone https://github.com/aws-samples/aws-eks-accelerator-for-terraform.git

Step2: (Optional) Update example main.tf file

Update the local variables in deploy/eks-cluster-with-new-vpc/main.tf, or keep the defaults. You can choose to use an existing VPC ID and subnet IDs, or create a new VPC and subnets by providing CIDR ranges, as sketched below.
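
For orientation only, the sketch below shows the kind of local values you would typically adjust; these names are hypothetical and may not match the variables actually defined in the example's main.tf, so check the file before editing.

    # Hypothetical local values; the actual names in main.tf may differ.
    locals {
      tenant      = "aws001"        # AWS account alias
      environment = "preprod"
      zone        = "dev"

      vpc_cidr           = "10.1.0.0/16"  # used only when creating a new VPC
      kubernetes_version = "1.21"
    }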

Step3: Set AWS profile or Assume IAM role

This role will become the Kubernetes admin by default. Please see this document for assuming a role.
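
If you prefer to configure credentials in Terraform rather than exporting a profile in your shell, a minimal provider sketch is shown below; the region, profile name, and role ARN are placeholders.

    # Sketch only: use a named AWS CLI profile, or assume an IAM role, in the provider block.
    provider "aws" {
      region  = "eu-west-1"      # placeholder region
      profile = "eks-admin"      # placeholder AWS CLI profile

      # Alternatively, assume an IAM role (placeholder ARN):
      # assume_role {
      #   role_arn = "arn:aws:iam::123456789012:role/terraform-eks-admin"
      # }
    }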

Step4: Run Terraform INIT

To initialize a working directory with configuration files

cd deploy/eks-cluster-with-new-vpc/
terraform init

Step5: Run Terraform PLAN

To verify the resources created by this execution

terraform plan

Step6: Finally, Terraform APPLY

To create the resources

terraform apply

Configure kubectl and test cluster

The EKS cluster name can be extracted from the Terraform output or from the AWS Console. The following command is used to update the kubeconfig on your local machine, where you run kubectl commands to interact with your EKS cluster.

Step7: Run update-kubeconfig command.

The ~/.kube/config file gets updated with the cluster details and certificate by the command below.

$ aws eks --region <region> update-kubeconfig --name <cluster-name>

Step8: List all the worker nodes by running the command below

$ kubectl get nodes

Step9: List all the pods running in kube-system namespace

$ kubectl get pods -n kube-system

Advanced Deployment Folder Structure

This example shows how to structure folders in your repo when you want to deploy multiple EKS Clusters across multiple regions and accounts.

The top-level deploy/advanced folder provides an example of how you can structure your folders and files to define multiple EKS cluster environments and consume this accelerator module. This approach is suitable for large projects, with a clearly defined sub-directory and file structure. It can be modified to suit your requirements. You can define a unique configuration for each EKS cluster, making this module the central source of truth.

Each folder under live/<region>/application represents an EKS cluster environment (e.g., dev, test, load, etc.). This folder contains backend.conf and <env>.tfvars, used to create a unique Terraform state for each cluster environment. The Terraform backend configuration can be updated in backend.conf, and the cluster's common configuration variables in <env>.tfvars. A hedged sketch of both files is shown below.
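
For illustration, a minimal sketch of what these two files might contain; the S3 bucket, DynamoDB table, and values below are placeholders rather than values from this repository.

    # backend.conf (placeholder values for an S3 backend)
    bucket         = "my-terraform-state-bucket"
    key            = "preprod/eu-west-1/application/dev/terraform.tfstate"
    region         = "eu-west-1"
    dynamodb_table = "my-terraform-lock-table"

    # dev.tfvars (placeholder values)
    tenant             = "aws001"
    environment        = "preprod"
    zone               = "dev"
    kubernetes_version = "1.21"

You would then initialize each environment with terraform init -backend-config=backend.conf and run plan/apply with -var-file=<env>.tfvars.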

e.g. folder/file structure for defining multiple clusters

    ├── deploy/advanced
    │   └── live
    │       └── preprod
    │           └── eu-west-1
    │               └── application
    │                   └── dev
    │                       └── backend.conf
    │                       └── dev.tfvars
    │                       └── main.tf
    │                       └── variables.tf
    │                       └── outputs.tf
    │                   └── test
    │                       └── backend.conf
    │                       └── test.tfvars
    │       └── prod
    │           └── eu-west-1
    │               └── application
    │                   └── prod
    │                       └── backend.conf
    │                       └── prod.tfvars
    │                       └── main.tf
    │                       └── variables.tf
    │                       └── outputs.tf

Important Note

If you are using an existing VPC, then you need to ensure that the following tags are added to the VPC and subnet resources.

Add Tags to VPC

    Key = "Kubernetes.io/cluster/${local.cluster_name}"
    Value = "Shared"

Add Tags to Public Subnets (tagging requirement)

    public_subnet_tags = {
      "kubernetes.io/cluster/${local.cluster_name}" = "shared"
      "kubernetes.io/role/elb"                      = "1"
    }

Add Tags to Private Subnets (tagging requirement)

    private_subnet_tags = {
      "kubernetes.io/cluster/${local.cluster_name}" = "shared"
      "kubernetes.io/role/internal-elb"             = "1"
    }

Fully private EKS clusters require the following VPC endpoints to be created to communicate with AWS services; an example Terraform resource for one of these endpoints follows the list.

com.amazonaws.region.aps-workspaces       - For Amazon Managed Prometheus workspaces
com.amazonaws.region.ssm                  - For AWS Systems Manager
com.amazonaws.region.ec2
com.amazonaws.region.ecr.api
com.amazonaws.region.ecr.dkr
com.amazonaws.region.logs                 - For CloudWatch Logs
com.amazonaws.region.sts                  - If using AWS Fargate or IAM roles for service accounts
com.amazonaws.region.elasticloadbalancing - If using Application Load Balancers
com.amazonaws.region.autoscaling          - If using Cluster Autoscaler
com.amazonaws.region.s3                   - Creates an S3 gateway endpoint
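
For illustration, the sketch below creates one of these interface endpoints with plain Terraform; the VPC ID, subnet IDs, and security group reference are placeholders.

    # Sketch only: interface endpoint for the ECR API in an existing VPC.
    # var.vpc_id, var.private_subnet_ids, and var.endpoint_sg_id are placeholders.
    resource "aws_vpc_endpoint" "ecr_api" {
      vpc_id              = var.vpc_id
      service_name        = "com.amazonaws.eu-west-1.ecr.api"
      vpc_endpoint_type   = "Interface"
      subnet_ids          = var.private_subnet_ids
      security_group_ids  = [var.endpoint_sg_id]
      private_dns_enabled = true
    }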

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.