/OnDemand_Kubernetes

Open OnDemand Connector for AWS EKS (Managed Kubernetes service)

Primary language: Shell. License: BSD 3-Clause "New" or "Revised" License (BSD-3-Clause).

OnDemand kubernetes

ISC23 Paper and presentation

Paper

Slides

Deploy EKS and Cognito

Prerequisites

  1. AWS credentials: an AWS IAM user or an EC2 instance with a role. The policy for the IAM user or the EC2 role must be sufficient to make API calls to the required AWS resources.
    The resources that need to be accessed are:
  • CloudFormation
  • EC2
  • IAM
  • Cognito
  • EKS
  • AutoScaling
  • SSM GetParameter
  2. IAM service user: this user is different from the AWS credentials above. It has credentials to access Cognito and EKS only. Its credentials should be set up on the Open OnDemand head node later, after EKS is deployed.

  3. A VPC and two subnets where the EKS cluster and its nodes will be hosted.

  4. An SSH key to log in to the EKS nodes if needed (for debugging, for example).

  5. eks/deploy.sh requirements: more requirements can be found in eks/README.md.
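Before deploying, it can help to confirm that the needed command-line tools are installed and that the active credentials resolve to the expected identity. The following is a minimal sketch, not part of this repository: the helper name `missing_tools` is hypothetical, and `aws` and `kubectl` are only assumed examples of required tools (the actual list is in eks/README.md).

```shell
# Hypothetical helper: print every tool from the argument list that is not on PATH.
# Returns 0 when nothing is missing, 1 otherwise.
missing_tools() {
  local tool rc
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  return "$rc"
}

# Example usage (the tool list is an assumption; see eks/README.md for the real one):
#   missing_tools aws kubectl || echo "install the tools listed above first"
#   aws sts get-caller-identity   # shows which IAM user or role the credentials map to
```

`aws sts get-caller-identity` only confirms who the credentials belong to; it does not prove that the attached policy covers every service listed above.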

From the machine that has AWS credentials with EC2, CloudFormation (CF), IAM read, Cognito, and EKS access, deploy Cognito and EKS as follows:

  1. Deploy Cognito
    EKS uses Cognito for user authentication, so Cognito must be deployed first. Deploy it with CloudFormation, because the CF stack name is used
    later to look up the Cognito attributes that EKS needs.
# aws cloudformation create-stack --region=$REGION --stack-name ${stackname} --template-body file://PATH/TO/COGNITO/TEMPLATE
aws cloudformation create-stack --region=us-east-1 --stack-name ood-cognito-1 --template-body file://cloudFormation/cognito.yaml
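`create-stack` returns as soon as the request is accepted, so the stack may still be building when the command exits. Since the EKS step reads attributes from the Cognito stack, one option is to wait for creation to finish and then inspect the stack outputs. This is a sketch under assumptions: the function name `wait_for_cognito_stack` and the `AWS_CLI` override are hypothetical conveniences, not part of this repository.

```shell
# AWS_CLI can be overridden (for example AWS_CLI=echo) for a dry run that only
# prints the commands instead of calling AWS.
AWS_CLI="${AWS_CLI:-aws}"

# Hypothetical helper: block until the stack is created, then list its outputs.
wait_for_cognito_stack() {
  local region="$1" stack="$2"
  # $AWS_CLI is left unquoted on purpose so the override can replace the binary.
  $AWS_CLI cloudformation wait stack-create-complete \
    --region "$region" --stack-name "$stack" || return 1
  $AWS_CLI cloudformation describe-stacks \
    --region "$region" --stack-name "$stack" \
    --query 'Stacks[0].Outputs' --output table
}

# Example usage with the values from the command above:
#   wait_for_cognito_stack us-east-1 ood-cognito-1
```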
  2. Deploy EKS
    After creating Cognito, use the eks/deploy.sh script to create the EKS cluster and its node group. Note that deploy.sh has additional requirements, which are listed in eks/README.md.
cd eks

#  If a .env file is available
#  ./deploy.sh CLUSTER_NAME
./deploy.sh my-ood-eks

#  If .env is not available
# ./deploy.sh CLUSTER_NAME REGION VPCTAG SUB1TAG SUB2TAG SSH_KEY IAM_USER OOD_CIDR COGNITO-STACK-NAME gpu|general
./deploy.sh my-ood-eks us-east-1 atood-dev-standard atood-dev-standard-app-pvt-1a atood-dev-standard-app-pvt-1b eks ood-dev-eks-cognito 10.31.0.0/16 ood-cognito-1 general
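The positional form takes ten arguments in a fixed order, so a swapped or missing value is easy to overlook. A small pre-flight check, sketched below, can catch the most obvious mistakes before anything is created. The function `check_deploy_args` is hypothetical and not part of deploy.sh; it only assumes what the usage line shows: ten arguments, OOD_CIDR in the eighth position, and `gpu` or `general` in the last.

```shell
# Hypothetical pre-flight check for the positional form of deploy.sh.
check_deploy_args() {
  if [ "$#" -ne 10 ]; then
    echo "expected 10 arguments, got $#" >&2
    return 1
  fi
  local ood_cidr="$8" node_type="${10}"
  # Loose IPv4 CIDR shape check only; octet and prefix ranges are not validated.
  if ! printf '%s\n' "$ood_cidr" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}/[0-9]{1,2}$'; then
    echo "OOD_CIDR does not look like a CIDR block: $ood_cidr" >&2
    return 1
  fi
  case "$node_type" in
    gpu|general) ;;
    *) echo "node type must be gpu or general, got: $node_type" >&2; return 1 ;;
  esac
  return 0
}

# Example usage from a wrapper script:
#   check_deploy_args "$@" && ./deploy.sh "$@"
```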
  3. ./deploy.sh creates the Open OnDemand config files. Move those files to the Open OnDemand server that will use the newly created EKS cluster. The files generated by deploy.sh are hook.env, k8s_cluster.yml, and kubernetes-ca.crt. They are located in eks/clusters_deployed/CLUSTERNAME_DATE/cluster_config/

  4. On the Open OnDemand server, after copying the files over, run ood-installation.sh with the required variables as documented in the code.
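Copying the generated files can be sketched as follows. The helper `have_cluster_config` is hypothetical; it only checks that the three files named above exist before anything is transferred. The use of `scp`, the `OOD_HOST` variable, and the destination directory are assumptions for illustration, so substitute whatever transfer method and paths your site uses.

```shell
# Hypothetical helper: succeed only if all three generated files are present.
have_cluster_config() {
  local dir="$1" f
  for f in hook.env k8s_cluster.yml kubernetes-ca.crt; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      return 1
    fi
  done
  return 0
}

# Example usage. CONFIG_DIR follows the path pattern given above; OOD_HOST and
# the destination directory are placeholders, not values from this repository.
#   CONFIG_DIR=eks/clusters_deployed/CLUSTERNAME_DATE/cluster_config
#   have_cluster_config "$CONFIG_DIR" &&
#     scp "$CONFIG_DIR"/hook.env "$CONFIG_DIR"/k8s_cluster.yml \
#         "$CONFIG_DIR"/kubernetes-ca.crt "$OOD_HOST":/tmp/
```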

Delete (Cleanup) EKS and Cognito

  1. To delete the EKS cluster, use delete.sh in the eks folder. See eks/README.md for more information on how to clean up the EKS cluster and its associated resources.

  2. To remove Cognito, you can use

aws cloudformation delete-stack --region=$REGION --stack-name $COGNITO_STACK_NAME

Replace $REGION with your region (for example, us-east-1) and $COGNITO_STACK_NAME with your Cognito stack name (as created in CloudFormation).
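`delete-stack` also returns before the deletion has finished. If a later step depends on the stack being gone, the delete can be paired with a waiter, as in the sketch below. The function name `delete_cognito_stack` and the `AWS_CLI` dry-run override are hypothetical and not part of this repository.

```shell
# AWS_CLI can be overridden (for example AWS_CLI=echo) to print the commands only.
AWS_CLI="${AWS_CLI:-aws}"

# Hypothetical helper: request deletion, then block until the stack is gone.
delete_cognito_stack() {
  local region="$1" stack="$2"
  if [ -z "$region" ] || [ -z "$stack" ]; then
    echo "usage: delete_cognito_stack REGION STACK_NAME" >&2
    return 2
  fi
  # $AWS_CLI is left unquoted on purpose so the override can replace the binary.
  $AWS_CLI cloudformation delete-stack \
    --region "$region" --stack-name "$stack" || return 1
  $AWS_CLI cloudformation wait stack-delete-complete \
    --region "$region" --stack-name "$stack"
}

# Example usage:
#   delete_cognito_stack us-east-1 ood-cognito-1
```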

Some Troubleshooting notes

If a user on the Open OnDemand server is unable to run a pod, check that user's kubeconfig (~/.kube/config in the user's home directory), as the config sometimes has an empty cluster list:

apiVersion: v1
clusters: null
contexts:
- context:
    cluster: arn:aws:eks:us-east-1:123456789123:cluster/faras-test-deploy-cmd
    user: fadel
  name: arn:aws:eks:us-east-1:123456789123:cluster/faras-test-deploy-cmd
current-context: arn:aws:eks:us-east-1:123456789123:cluster/faras-test-deploy-cmd
kind: Config

It should look like this instead:

apiVersion: v1
clusters:
- cluster:
    certificate-authority: /etc/pki/tls/certs/kubernetes-ca.crt
    server: https://FE212345AFSD352345SDDSG35345SDF7.xl5.us-east-1.eks.amazonaws.com
  name: arn:aws:eks:us-east-1:123456789123:cluster/faras-test-deploy-cmd
contexts:
- context:
    cluster: arn:aws:eks:us-east-1:123456789123:cluster/faras-test-deploy-cmd
    user: fadel
  name: arn:aws:eks:us-east-1:123456789123:cluster/faras-test-deploy-cmd
current-context: arn:aws:eks:us-east-1:123456789123:cluster/faras-test-deploy-cmd
kind: Config
.
.

This should be fixed automatically by the hook file that creates and configures cognito.
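To check for this condition without reading the file by hand, a simple pattern match is enough. The sketch below is an assumption-laden illustration: `kubeconfig_missing_clusters` is a hypothetical helper, and the manual repair shown in the comments uses `kubectl config set-cluster` with placeholder variables (`CLUSTER_ARN`, `SERVER_URL`) that you would need to fill in for your cluster. It is a possible workaround, not what the hook itself does.

```shell
# Hypothetical helper: return 0 if the kubeconfig has an empty cluster list
# ("clusters: null" or "clusters: []"), non-zero otherwise.
kubeconfig_missing_clusters() {
  grep -Eq '^clusters:[[:space:]]*(null|\[\])[[:space:]]*$' "$1"
}

# Example usage. CLUSTER_ARN and SERVER_URL are placeholders for the cluster
# ARN and API server endpoint; the CA path matches the example config above.
#   if kubeconfig_missing_clusters ~/.kube/config; then
#     kubectl config set-cluster "$CLUSTER_ARN" \
#       --server="$SERVER_URL" \
#       --certificate-authority=/etc/pki/tls/certs/kubernetes-ca.crt
#   fi
```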