terraform-deploykf

terraform-deploykf is a Terraform module for deploying deployKF (a Kubeflow distribution).


It is 100% open source and licensed under the Apache License 2.0.

Introduction

This Terraform module is a work in progress. Its intent is to provide the plumbing that deployKF requires and to provision deployKF end-to-end for different use cases. User guides for common setups will be added as those setups become supported.

The module requires an existing Kubernetes cluster.
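
Because the module works against an existing cluster, the calling configuration needs working kubernetes and helm providers. The following is a minimal sketch, assuming that credentials live in a local kubeconfig at ~/.kube/config and that the helm provider is a 2.x release (as in the Providers table). The kubeconfig path is an assumption; adjust it for your environment.

```hcl
# Assumption: the cluster is reachable through a local kubeconfig file.
provider "kubernetes" {
  config_path = "~/.kube/config"
}

# The helm provider 2.x takes its cluster settings in a nested kubernetes block.
provider "helm" {
  kubernetes {
    config_path = "~/.kube/config"
  }
}
```

Alternatively, both providers can read the kubeconfig location from the KUBE_CONFIG_PATH environment variable, which is what the Testing section relies on.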

Usage

A fully automated end-to-end deployment is not yet possible. The general steps are:

  1. Call the module in your Terraform configuration, e.g.:
module "deploy-kf" {
  source = "git::https://github.com/flaccid/terraform-deploykf"

  app_of_apps_values  = file("app-of-apps-values.yaml")
  argocd_helm_values  = <<EOF
configs:
  params:
    server.insecure: true

server:
  ingress:
    enabled: true
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-production
      ingress.kubernetes.io/ssl-redirect: 'true'
      nginx.ingress.kubernetes.io/backend-protocol: "HTTP"
      nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
    ingressClassName: nginx
    hosts:
      - argocd.mydomain.com
    tls:
      - hosts:
          - argocd.mydomain.com
        secretName: argocd-mydomain-com-tls
EOF
  install_app_of_apps = true
}

Populate app-of-apps-values.yaml with the values you need per your use case and environment.

  2. Sync the Argo CD applications per https://www.deploykf.org/guides/getting-started/
  3. Profit
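
As an alternative to the heredoc in step 1, the Argo CD Helm values can be built with Terraform's built-in yamlencode function, which avoids YAML indentation mistakes. This is only a sketch: it reuses the example hostname and a subset of the values from above, and it assumes the module accepts any valid YAML string in argocd_helm_values.

```hcl
module "deploy-kf" {
  source = "git::https://github.com/flaccid/terraform-deploykf"

  app_of_apps_values = file("app-of-apps-values.yaml")

  # yamlencode renders this HCL object as a YAML string.
  argocd_helm_values = yamlencode({
    configs = {
      params = {
        # Keys containing dots must be quoted in HCL.
        "server.insecure" = true
      }
    }
    server = {
      ingress = {
        enabled          = true
        ingressClassName = "nginx"
        hosts            = ["argocd.mydomain.com"]
      }
    }
  })

  install_app_of_apps = true
}
```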

On AWS

To provision for AWS, set the variables create_s3_buckets and create_iam_resources to true.

Additionally, set provision_rds_instance to true if you want to provision and use an RDS instance.
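
A minimal sketch of a module call with the AWS options enabled. The variable names come from the Inputs table; the bucket name, VPC ID and subnet IDs are hypothetical placeholders, and it is an assumption that the RDS instance needs rds_vpc_id and rds_subnet_ids to be set.

```hcl
module "deploy-kf" {
  source = "git::https://github.com/flaccid/terraform-deploykf"

  region = "us-east-1"

  # Create the S3 bucket and IAM resources used by Kubeflow.
  create_s3_buckets     = true
  create_iam_resources  = true
  pipelines_bucket_name = "my-kubeflow-pipelines" # hypothetical name

  # Optional: provision an RDS instance for the Kubeflow databases.
  provision_rds_instance = true
  rds_vpc_id             = "vpc-0123456789abcdef0"                # placeholder
  rds_subnet_ids         = ["subnet-aaaa1111", "subnet-bbbb2222"] # placeholders
}
```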

Argo CD UI

Log in with the default username (admin) and the password retrieved using:

echo $(kubectl -n argocd get secret/argocd-initial-admin-secret \
  -o jsonpath="{.data.password}" | base64 -d)

Kubeflow UI

TBA

Upstream Documentation

https://www.kubeflow.org/docs/started/installing-kubeflow/#packaged-distributions-of-kubeflow

deployKF

https://www.deploykf.org/guides/getting-started/

Pertinent links for this implementation

Testing

Populate any required variables in test/terraform.tfvars, then run:

cd test
export KUBE_CONFIG_PATH=~/.kube/config
terraform init
terraform apply
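
For illustration, a test/terraform.tfvars file might look like the sketch below. It assumes the test configuration declares variables with the same names as the module inputs, which this README does not confirm; the values shown are the documented defaults or simple toggles.

```hcl
# test/terraform.tfvars (illustrative values only)
cluster_name     = "kubeflow"
argocd_namespace = "argocd"

# Install Argo CD first; enable the app of apps once its values file is ready.
install_argocd      = true
install_app_of_apps = false
```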

Makefile Targets

Available targets:

  help                                Help screen
  help/all                            Display help for all targets
  help/short                          This help short screen

Requirements

Name Version
terraform >= 1.2.7
aws >= 5.31.0
helm >= 2.12.0
kubernetes >= 2.25.0
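
The constraints above can be mirrored in the calling configuration. This sketch simply restates the table using the standard hashicorp provider sources; whether you add upper bounds is a design choice for your environment.

```hcl
terraform {
  required_version = ">= 1.2.7"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.31.0"
    }
    helm = {
      source  = "hashicorp/helm"
      version = ">= 2.12.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = ">= 2.25.0"
    }
  }
}
```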

Providers

Name Version
aws 5.31.0
helm 2.12.1
http 3.4.1
kubernetes 2.25.2
null 3.2.2
random 3.6.0

Modules

Name Source Version
irsa-ebs-csi terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc 4.7.0
kubeflow-eks-cluster terraform-aws-modules/eks/aws 19.15.3
kubeflow-mysql-instance cloudposse/rds/aws 1.1.0

Resources

Name Type
aws_eks_addon.ebs-csi resource
aws_iam_policy.kubeflow-storage resource
aws_iam_role.kubeflow resource
aws_iam_role_policy_attachment.kubeflow-storage resource
aws_route53_record.argocd-server resource
aws_s3_bucket.kubeflow-pipelines resource
helm_release.argo-cd resource
kubernetes_config_map.argocd-deploykf-plugin resource
kubernetes_manifest.app-of-apps resource
kubernetes_namespace.argo-cd resource
kubernetes_namespace.deploykf-auth resource
kubernetes_namespace.kubeflow resource
kubernetes_persistent_volume_claim.argocd-deploykf-plugin-assets resource
kubernetes_secret.kubeflow-db-credentials resource
null_resource.patch-argocd-repo-server resource
random_string.kubeflow_db_password resource
aws_availability_zones.available data source
aws_caller_identity.current data source
aws_eks_cluster.kubeflow data source
aws_iam_policy.ebs_csi_policy data source
aws_route53_zone.kubeflow data source
http_http.argocd-deploykf-plugin data source

Inputs

Name Description Type Default Required
addons Manages aws_eks_addon resources. list(object({ addon_name = string, addon_version = string, resolve_conflicts = string, service_account_role_arn = string })) [] no
app_of_apps_values Application values for the 'app of apps' argocd application string "" no
apply_config_map_aws_auth Whether to apply the ConfigMap to allow worker nodes to join the EKS cluster and allow additional users, accounts and roles to access the cluster bool false no
argocd_chart_version n/a string "5.51.6" no
argocd_helm_values Helm values for the argocd deployment string "" no
argocd_namespace n/a string "argocd" no
cluster_encryption_config_enabled Set to true to enable Cluster Encryption Configuration bool true no
cluster_encryption_config_kms_key_deletion_window_in_days Cluster Encryption Config KMS Key Resource argument - key deletion windows in days post destruction number 10 no
cluster_encryption_config_kms_key_enable_key_rotation Cluster Encryption Config KMS Key Resource argument - enable kms key rotation bool true no
cluster_encryption_config_kms_key_id KMS Key ID to use for cluster encryption config string "" no
cluster_encryption_config_kms_key_policy Cluster Encryption Config KMS Key Resource argument - key policy string null no
cluster_encryption_config_resources Cluster Encryption Config Resources to encrypt, e.g. ['secrets'] list(any) ["secrets"] no
cluster_log_retention_period Number of days to retain cluster logs. Requires enabled_cluster_log_types to be set. See https://docs.aws.amazon.com/en_us/eks/latest/userguide/control-plane-logs.html. number 0 no
cluster_name The name of the cluster to deploy kubeflow to (and to create if chosen) string "kubeflow" no
create_argocd_namespace n/a bool true no
create_databases n/a bool false no
create_deploykf_auth_namespace n/a bool true no
create_eks_cluster n/a bool false no
create_iam_resources n/a bool false no
create_kubeflow_namespace n/a bool true no
create_s3_buckets n/a bool false no
create_zone_records n/a bool false no
deploykf_repo_ref n/a string "v0.1.3" no
deploykf_repo_url n/a string "https://github.com/thesuperzapper/deployKF.git" no
desired_size Desired number of worker nodes number 2 no
disk_size Disk size in GiB for worker nodes. Defaults to 20. Terraform will only perform drift detection if a configuration value is provided number 20 no
enabled_cluster_log_types A list of the desired control plane logging to enable. For more information, see https://docs.aws.amazon.com/en_us/eks/latest/userguide/control-plane-logs.html. Possible values [api, audit, authenticator, controllerManager, scheduler] list(string) [] no
existing_eks_cluster Use an existing eks cluster bool false no
hosted_zone_id Route 53 hosted zone name for dns records any null no
hosted_zone_private Whether the hosted zone is private bool false no
install_app_of_apps n/a bool false no
install_argocd n/a bool true no
instance_types Set of instance types associated with the EKS Node Group. map { "default": "t3.small", "rancher-0": "t3.small", "rancher-1": "t3.small" } no
kubeflow_database_name n/a string "kubeflow" no
kubeflow_database_password The kubeflow database password string "" no
kubeflow_database_user n/a string "kubeflow" no
kubeflow_iam_role_arn The IAM role ARN to use for kubeflow (optional) any null no
kubernetes_config_map_ignore_role_changes Set to true to ignore IAM role changes in the Kubernetes Auth ConfigMap bool true no
kubernetes_labels Key-value mapping of Kubernetes labels. Only labels that are applied with the EKS API are managed by this argument. Other Kubernetes labels applied to the EKS Node Group will not be managed map(string) {} no
kubernetes_version Desired Kubernetes master version. If you do not specify a value, the latest available version is used string "1.26" no
local_exec_interpreter Shell to use for local_exec list(string) ["/bin/sh", "-c"] no
map_additional_iam_roles Additional IAM role mappings to add to aws-auth configmap list [ { "groups": ["system:masters"], "rolearn": "arn:aws:iam::631165420711:role/AWSReservedSSO_CloudAdmin_g_2e075014822efad3", "username": "cloud-admin-g" }, { "groups": ["system:masters"], "rolearn": "arn:aws:iam::631165420711:role/PhxBuildAgentRole", "username": "phx-build-agent-role" } ] no
max_size The maximum size of the AutoScaling Group number 3 no
min_size The minimum size of the AutoScaling Group number 2 no
mysql_host Hostname to connect to for the mysql database server string "localhost" no
node_role_arn IAM role ARN to use for nodes in the worker group(s) string "arn:aws:iam::631165420711:role/cp-core-eks-nodegroup-default-NodeInstanceRole" no
oidc_provider_enabled Create an IAM OIDC identity provider for the cluster, then you can create IAM roles to associate with a service account in the cluster, instead of using kiam or kube2iam. For more information, see https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html bool true no
pipelines_bucket_name n/a string "kubeflow-pipelines" no
provision_eks provision eks cluster and node group bool false no
provision_rds_instance n/a bool false no
rds_allocated_storage Size in gigabytes to allocate to the rds instance number 20 no
rds_allow_major_version_upgrade Allow major version upgrade bool false no
rds_apply_immediately Specifies whether any database modifications are applied immediately, or during the next maintenance window bool false no
rds_auto_minor_version_upgrade Allow automated minor version upgrade bool true no
rds_backup_retention_period Backup retention period in days. Must be > 0 to enable backups number 0 no
rds_backup_window When AWS can perform DB snapshots, can't overlap with maintenance window string "22:00-03:00" no
rds_copy_tags_to_snapshot Copy tags from DB to a snapshot bool true no
rds_database_port n/a number 3306 no
rds_db_parameter_group n/a string "mysql8.0" no
rds_engine n/a string "mysql" no
rds_engine_version n/a string "8.0.35" no
rds_instance_class Instance type for the rds instance string "db.t3.small" no
rds_kms_key_arn The KMS Key ARN used for encryption with the storage any null no
rds_maintenance_window The window to perform maintenance in. Syntax: 'ddd:hh24:mi-ddd:hh24:mi' UTC string "Mon:03:00-Mon:04:00" no
rds_major_engine_version n/a string "8.0" no
rds_name_prefix Prefix string to use for the RDS database name string "kubeflow-poc" no
rds_skip_final_snapshot If true, no snapshot will be made before deleting the DB bool false no
rds_subnet_ids VPC subnet IDs for the rds instance list [] no
rds_vpc_id VPC ID for the rds instance any null no
region n/a string "us-east-1" no
storage_class_name n/a any null no
subnet_ids List of subnet IDs to use for the eks cluster list(string) null no
vpc_id ID of VPC to deploy eks cluster to any null no
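
As a sketch of how these inputs combine, the call below pins the Argo CD chart and deployKF versions explicitly and skips namespace creation. The values are the documented defaults from the table; the scenario where the namespaces are managed outside this module is an assumption made for illustration.

```hcl
module "deploy-kf" {
  source = "git::https://github.com/flaccid/terraform-deploykf"

  # Pin versions explicitly so upgrades are deliberate.
  argocd_chart_version = "5.51.6"
  deploykf_repo_url    = "https://github.com/thesuperzapper/deployKF.git"
  deploykf_repo_ref    = "v0.1.3"

  # Assumption: these namespaces already exist and are managed elsewhere.
  create_argocd_namespace        = false
  create_deploykf_auth_namespace = false
  create_kubeflow_namespace      = false
}
```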

Outputs

No outputs.

✨ Contributing

This project is under active development, and we encourage contributions from our community. Many thanks to our outstanding contributors.

🐛 Bug Reports & Feature Requests

Please use the issue tracker to report any bugs or file feature requests.

💻 Developing

In general, PRs are welcome. We follow the typical "fork-and-pull" Git workflow.

  1. Fork the repo on GitHub
  2. Clone the project to your own machine
  3. Commit changes to your own branch
  4. Push your work back up to your fork
  5. Submit a Pull Request so that we can review your changes

NOTE: Be sure to merge the latest changes from "upstream" before making a pull request!

License

See LICENSE for full details.

Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

  https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License.

Trademarks

All other trademarks referenced herein are the property of their respective owners.

Copyright © 2023-2024 Chris Fordham