
This repository contains scripts to manage the PhenoMeNal cloud CRE.


Cloud-deploy-kubenow

This page guides you through setting up a PhenoMeNal CRE on Amazon Web Services, Google Cloud, Microsoft Azure, or a public or private OpenStack environment from the command line. Normally you would use the PhenoMeNal portal to launch a CRE on the supported cloud providers; under the hood, the portal uses the procedure below. In special cases (a private OpenStack installation, or development work) you may want to run the infrastructure provisioning procedure without the web GUI.

Prerequisites

You need the following tools installed on your local machine in order to provision Phenomenal-KubeNow:

  • Git to clone the install scripts from the GitHub repository
  • Docker to run the container with all other dependencies
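
A quick way to confirm that both tools are available before continuing is sketched below. The missing_tools helper is a hypothetical name introduced for this example; it is not part of the repository.

```shell
# Print the name of every command in the argument list that is not on PATH.
# missing_tools is an illustrative helper, not part of cloud-deploy-kubenow.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools git docker)
if [ -n "$missing" ]; then
  echo "Install these tools before continuing:"
  echo "$missing"
else
  echo "git and docker are both available."
fi
```

Note that this only checks that the commands exist; it does not check that the Docker daemon is running.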

Get Phenomenal-KubeNow

Phenomenal-KubeNow is distributed via GitHub:

# The repository contains submodules, so pass the `--recursive` parameter when cloning, e.g.
git clone --recursive https://github.com/phnmnl/cloud-deploy-kubenow.git

cd cloud-deploy-kubenow

# If you later want to pull the latest version, including the latest submodule updates:

git pull --recurse-submodules
git submodule update --recursive --remote

All of the commands in this documentation are meant to be run in the cloud-deploy-kubenow directory.
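
If the repository was cloned without `--recursive`, the submodule directories will be empty. `git submodule status` marks uninitialized submodules with a leading `-`, so a small filter can list them. The function name below is made up for this example.

```shell
# List the paths of submodules that `git submodule status` reports as
# uninitialized (lines starting with "-"). Reads the status output on stdin.
uninitialized_submodules() {
  awk '/^-/ { print $2 }'
}

# Typical use from the cloud-deploy-kubenow directory:
#   git submodule status --recursive | uninitialized_submodules
# If anything is printed, initialize it with:
#   git submodule update --init --recursive
```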

Deploy on Amazon Web Services

Amazon specific prerequisites

Configuration

Start by creating your configuration file, config.aws.sh. A template is provided for your convenience:

mv config.aws.sh-template config.aws.sh

In this configuration file you will need to set:

Cluster

  • TF_VAR_cluster_prefix: every resource in your tenancy will be named with this prefix

  • TF_VAR_aws_access_key_id: your access key id

  • TF_VAR_aws_secret_access_key: your secret access key

  • TF_VAR_aws_region: the region where your cluster will be bootstrapped (e.g. eu-west-1)

  • TF_VAR_availability_zone: an availability zone for your cluster (e.g. eu-west-1a)

Master configuration

  • TF_VAR_master_instance_type: an instance flavor for the master
  • TF_VAR_master_as_edge:

Node configuration

  • TF_VAR_node_count: number of Kubernetes nodes to be created (no floating IP is needed for these nodes)
  • TF_VAR_node_instance_type: an instance flavor name for the Kubernetes nodes

Gluster configuration

  • TF_VAR_glusternode_count: number of glusternodes to be created (1 - 3 depending on preferred replication factor)
  • TF_VAR_glusternode_instance_type: an instance flavor for the glusternodes
  • TF_VAR_glusternode_extra_disk_size: size of the file server's extra disk in GB
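
The 1 to 3 range for TF_VAR_glusternode_count can be checked before deploying. This is only a sketch: valid_glusternode_count is a hypothetical helper, and the deployment scripts may perform their own validation.

```shell
# Succeed only when the argument is exactly 1, 2 or 3.
valid_glusternode_count() {
  case "$1" in
    1|2|3) return 0 ;;
    *)     return 1 ;;
  esac
}

# Example with an assumed value; in practice it comes from config.aws.sh.
TF_VAR_glusternode_count=2
if valid_glusternode_count "$TF_VAR_glusternode_count"; then
  echo "glusternode count OK"
else
  echo "TF_VAR_glusternode_count must be 1, 2 or 3" >&2
fi
```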

Edge configuration (optional)

  • TF_VAR_edge_count: number of edge nodes to be created
  • TF_VAR_edge_instance_type: an instance flavor for the edge nodes

Cloudflare (optional) - See: KubeNow Cloudflare documentation.

  • TF_VAR_use_cloudflare: whether you want to use Cloudflare as DNS provider
  • TF_VAR_cloudflare_email: the mail that you used to register your Cloudflare account
  • TF_VAR_cloudflare_token: an authentication token that you can generate from the Cloudflare web interface
  • TF_VAR_cloudflare_domain: a zone that you created in your Cloudflare account. This typically matches your domain name (e.g. somedomain.com)

Galaxy

  • TF_VAR_galaxy_admin_email: email address of the local Galaxy admin (typically you)
  • TF_VAR_galaxy_admin_password: admin password (at least 6 characters)

Jupyter

  • TF_VAR_jupyter_password: password for your notebook

Pachyderm + Minio (optional)

  • TF_VAR_minio_release_name: release name for the Minio service
  • TF_VAR_minio_pvc_size: storage dedicated for the Minio service (In GB)
  • TF_VAR_minio_accesskey: access key for the S3 endpoint
  • TF_VAR_minio_secretkey: secret key for the S3 endpoint
  • TF_VAR_pachyderm_release_name: a release name for the Pachyderm service
  • TF_VAR_pachyderm_etcd_pvc_size: storage dedicated for etcd (In GB)
  • TF_VAR_pachyderm_minio_accesskey: access key of the S3 endpoint
  • TF_VAR_pachyderm_minio_secretkey: secret key of the S3 endpoint
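
For orientation, a filled-in configuration might look roughly like the fragment below. It assumes the template uses plain `export` statements, which should be checked against config.aws.sh-template. Every value shown is a placeholder, and the instance types are only examples.

```shell
# Hypothetical excerpt of config.aws.sh; all values are placeholders.
export TF_VAR_cluster_prefix="mycluster"
export TF_VAR_aws_access_key_id="REPLACE_WITH_ACCESS_KEY_ID"
export TF_VAR_aws_secret_access_key="REPLACE_WITH_SECRET_ACCESS_KEY"
export TF_VAR_aws_region="eu-west-1"
export TF_VAR_availability_zone="eu-west-1a"

export TF_VAR_master_instance_type="t2.large"        # example instance type
export TF_VAR_node_count="2"
export TF_VAR_node_instance_type="t2.large"          # example instance type

export TF_VAR_glusternode_count="1"                  # 1 - 3
export TF_VAR_glusternode_instance_type="t2.large"   # example instance type
export TF_VAR_glusternode_extra_disk_size="100"      # GB

export TF_VAR_galaxy_admin_email="admin@example.com"
export TF_VAR_galaxy_admin_password="change-me-please"
export TF_VAR_jupyter_password="change-me-too"
```

Since the file holds credentials and passwords, it is worth keeping it out of version control.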

Once you are done with your settings you are ready to deploy the cluster:

./phenomenal.sh deploy aws

When the deployment has finished, you should be able to reach the services at:

Galaxy         = http://galaxy.<your-prefix>.<yourdomain>
Jupyter        = http://notebook.<your-prefix>.<yourdomain>
Luigi          = http://luigi.<your-prefix>.<yourdomain>
Kube-dashboard = http://dashboard.<your-prefix>.<yourdomain>
Pachyderm      = by ssh-ing onto your master node, and installing the Pachyderm client:

curl -o /tmp/pachctl.deb -L https://github.com/pachyderm/pachyderm/releases/download/v1.6.6/pachctl_1.6.6_amd64.deb && sudo dpkg -i /tmp/pachctl.deb

Please note that the client version should correspond with the pachd service version. For more information please consult: http://pachyderm.readthedocs.io/en/latest/index.html. You can see an example on how to create pipelines here: https://github.com/pachyderm/pachyderm/tree/master/doc/examples/gatk
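
The Galaxy, Jupyter, Luigi and Kube-dashboard addresses listed above share the pattern `<service>.<your-prefix>.<yourdomain>`. A small helper, hypothetical and shown only to make the pattern concrete, can print them for a given prefix and domain.

```shell
# Print the HTTP address of one service for a given cluster prefix and domain.
# $1 = service host label, $2 = cluster prefix, $3 = domain
service_url() {
  printf 'http://%s.%s.%s\n' "$1" "$2" "$3"
}

# Example values; replace with your own prefix and domain.
prefix="mycluster"
domain="somedomain.com"
for service in galaxy notebook luigi dashboard; do
  service_url "$service" "$prefix" "$domain"
done
```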

To destroy the cluster, use:

./phenomenal.sh destroy aws

Deploy on Google Cloud Platform

Google Cloud specific prerequisites

  • You have enabled the Google Compute Engine API: API Manager > Library > Compute Engine API > Enable

  • You have created and downloaded a service account file for your GCE project: API Manager > Credentials > Create credentials > Service account key

  • You have installed the Python packages apache-libcloud and jmespath (e.g. sudo pip install apache-libcloud jmespath)
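
Whether the two Python packages are importable can be checked with a short script. This sketch assumes the interpreter is `python3` unless the PYTHON variable says otherwise; use whichever interpreter your `pip` installs into. The helper name is made up for this example.

```shell
# Succeed if the given Python module can be imported.
# PYTHON may be set to choose the interpreter; python3 is assumed by default.
have_python_module() {
  "${PYTHON:-python3}" -c "import $1" >/dev/null 2>&1
}

# apache-libcloud is imported as "libcloud"; jmespath as "jmespath".
for module in libcloud jmespath; do
  if have_python_module "$module"; then
    echo "found:   $module"
  else
    echo "missing: $module"
  fi
done
```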

Configuration

Start by creating your configuration file, config.gcp.sh. A template is provided for your convenience:

mv config.gcp.sh-template config.gcp.sh

In this configuration file you will need to set:

Cluster

  • TF_VAR_cluster_prefix: every resource in your tenancy will be named with this prefix

  • TF_VAR_gce_credentials_file: path to your service account file

  • TF_VAR_gce_region: the zone for your project (e.g. europe-west1-b)

  • TF_VAR_gce_project: your project id
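
Because TF_VAR_gce_credentials_file is a path, a mistyped value is easy to catch before deploying. The check below is a sketch with a hypothetical helper name and a placeholder path.

```shell
# Succeed if the path names a readable, non-empty file.
credentials_file_ok() {
  [ -r "$1" ] && [ -s "$1" ]
}

# Placeholder path; use the location of your downloaded service account file.
TF_VAR_gce_credentials_file="$HOME/service-account.json"
if credentials_file_ok "$TF_VAR_gce_credentials_file"; then
  echo "credentials file found"
else
  echo "cannot read $TF_VAR_gce_credentials_file" >&2
fi
```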

Master configuration

  • TF_VAR_master_flavor: an instance flavor for the master
  • TF_VAR_master_as_edge:

Node configuration

  • TF_VAR_node_count: number of Kubernetes nodes to be created (no floating IP is needed for these nodes)
  • TF_VAR_node_flavor: an instance flavor name for the Kubernetes nodes

Gluster configuration

  • TF_VAR_glusternode_count: number of glusternodes to be created (1 - 3 depending on preferred replication factor)
  • TF_VAR_glusternode_flavor: an instance flavor for the glusternodes
  • TF_VAR_glusternode_extra_disk_size: size of the file server's extra disk in GB

Edge configuration (optional)

  • TF_VAR_edge_count: number of edge nodes to be created
  • TF_VAR_edge_flavor: an instance flavor for the edge nodes

Cloudflare (optional) - See: KubeNow Cloudflare documentation.

  • TF_VAR_use_cloudflare: whether you want to use Cloudflare as DNS provider
  • TF_VAR_cloudflare_email: the mail that you used to register your Cloudflare account
  • TF_VAR_cloudflare_token: an authentication token that you can generate from the Cloudflare web interface
  • TF_VAR_cloudflare_domain: a zone that you created in your Cloudflare account. This typically matches your domain name (e.g. somedomain.com)

Galaxy

  • TF_VAR_galaxy_admin_email: email address of the local Galaxy admin (typically you)
  • TF_VAR_galaxy_admin_password: admin password (at least 6 characters)
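
The six-character minimum for the Galaxy admin password is simple to verify locally. The function below is a hypothetical helper for illustration, and the password is a placeholder.

```shell
# Succeed when the password has at least six characters.
galaxy_password_ok() {
  [ "${#1}" -ge 6 ]
}

# Placeholder value; in practice this comes from your configuration file.
TF_VAR_galaxy_admin_password="change-me"
if galaxy_password_ok "$TF_VAR_galaxy_admin_password"; then
  echo "Galaxy admin password length OK"
else
  echo "TF_VAR_galaxy_admin_password needs at least 6 characters" >&2
fi
```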

Jupyter

  • TF_VAR_jupyter_password: password for your notebook

Pachyderm + Minio (optional)

  • TF_VAR_minio_release_name: release name for the Minio service
  • TF_VAR_minio_pvc_size: storage dedicated for the Minio service (In GB)
  • TF_VAR_minio_accesskey: access key for the S3 endpoint
  • TF_VAR_minio_secretkey: secret key for the S3 endpoint
  • TF_VAR_pachyderm_release_name: a release name for the Pachyderm service
  • TF_VAR_pachyderm_etcd_pvc_size: storage dedicated for etcd (In GB)
  • TF_VAR_pachyderm_minio_accesskey: access key of the S3 endpoint
  • TF_VAR_pachyderm_minio_secretkey: secret key of the S3 endpoint

Once you are done with your settings you are ready to deploy the cluster:

./phenomenal.sh deploy gcp

When the deployment has finished, you should be able to reach the services at:

Galaxy         = http://galaxy.<your-prefix>.<yourdomain>
Jupyter        = http://notebook.<your-prefix>.<yourdomain>
Luigi          = http://luigi.<your-prefix>.<yourdomain>
Kube-dashboard = http://dashboard.<your-prefix>.<yourdomain>
Pachyderm      = by ssh-ing onto your master node, and installing the Pachyderm client:

curl -o /tmp/pachctl.deb -L https://github.com/pachyderm/pachyderm/releases/download/v1.6.6/pachctl_1.6.6_amd64.deb && sudo dpkg -i /tmp/pachctl.deb

Please note that the client version should correspond with the pachd service version. For more information please consult: http://pachyderm.readthedocs.io/en/latest/index.html. You can see an example on how to create pipelines here: https://github.com/pachyderm/pachyderm/tree/master/doc/examples/gatk
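
If the pachd service in your cluster is not version 1.6.6, the client download link has to change with it. The sketch below builds the link from a version number by following the pattern of the v1.6.6 URL above. That other releases use the same file naming is an assumption, so check the Pachyderm releases page before relying on it.

```shell
# Build the download URL of the pachctl .deb package for a given version,
# following the pattern of the v1.6.6 link used in this guide.
pachctl_deb_url() {
  printf 'https://github.com/pachyderm/pachyderm/releases/download/v%s/pachctl_%s_amd64.deb\n' "$1" "$1"
}

# Print the URL for the version used in this guide.
pachctl_deb_url 1.6.6
```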

To destroy the cluster, use:

./phenomenal.sh destroy gcp

Deploy on OpenStack

OpenStack specific prerequisites

Configuration

Start by creating your configuration file, config.ostack.sh. A template is provided for your convenience:

mv config.ostack.sh-template config.ostack.sh

In this configuration file you will need to set:

Cluster

Master configuration

  • TF_VAR_master_flavor: an instance flavor for the master
  • TF_VAR_master_as_edge:

Node configuration

  • TF_VAR_node_count: number of Kubernetes nodes to be created (no floating IP is needed for these nodes)
  • TF_VAR_node_flavor: an instance flavor name for the Kubernetes nodes

Gluster configuration

  • TF_VAR_glusternode_count: number of glusternodes to be created (1 - 3 depending on preferred replication factor)
  • TF_VAR_glusternode_flavor: an instance flavor for the glusternodes
  • TF_VAR_glusternode_extra_disk_size: size of the file server's extra disk in GB

Edge configuration (optional)

  • TF_VAR_edge_count: number of edge nodes to be created
  • TF_VAR_edge_flavor: an instance flavor for the edge nodes

Cloudflare (optional) - See: KubeNow Cloudflare documentation.

  • TF_VAR_use_cloudflare: whether you want to use Cloudflare as DNS provider
  • TF_VAR_cloudflare_email: the mail that you used to register your Cloudflare account
  • TF_VAR_cloudflare_token: an authentication token that you can generate from the Cloudflare web interface
  • TF_VAR_cloudflare_domain: a zone that you created in your Cloudflare account. This typically matches your domain name (e.g. somedomain.com)
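
The three Cloudflare settings only matter when Cloudflare is switched on. The sketch below reports which of them are still empty in that case. It assumes TF_VAR_use_cloudflare is enabled with the literal value `true`, which should be checked against the template; the function name is hypothetical.

```shell
# When Cloudflare is enabled, print the name of each Cloudflare setting
# that is unset or empty, one per line. Prints nothing when it is disabled.
# Assumption: "true" is the value that enables Cloudflare.
missing_cloudflare_vars() {
  [ "${TF_VAR_use_cloudflare:-false}" = "true" ] || return 0
  local name value
  for name in TF_VAR_cloudflare_email TF_VAR_cloudflare_token TF_VAR_cloudflare_domain; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

# Usage after sourcing your configuration file:
#   missing_cloudflare_vars
```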

Galaxy

  • TF_VAR_galaxy_admin_email: email address of the local Galaxy admin (typically you)
  • TF_VAR_galaxy_admin_password: admin password (at least 6 characters)

Jupyter

  • TF_VAR_jupyter_password: password for your notebook

Pachyderm + Minio (optional)

  • TF_VAR_minio_release_name: release name for the Minio service
  • TF_VAR_minio_pvc_size: storage dedicated for the Minio service (In GB)
  • TF_VAR_minio_accesskey: access key for the S3 endpoint
  • TF_VAR_minio_secretkey: secret key for the S3 endpoint
  • TF_VAR_pachyderm_release_name: a release name for the Pachyderm service
  • TF_VAR_pachyderm_etcd_pvc_size: storage dedicated for etcd (In GB)
  • TF_VAR_pachyderm_minio_accesskey: access key of the S3 endpoint
  • TF_VAR_pachyderm_minio_secretkey: secret key of the S3 endpoint

Once you are done with your settings you are ready to deploy the cluster:

./phenomenal.sh deploy ostack

When the deployment has finished, you should be able to reach the services at:

Galaxy         = http://galaxy.<your-prefix>.<yourdomain>
Jupyter        = http://notebook.<your-prefix>.<yourdomain>
Luigi          = http://luigi.<your-prefix>.<yourdomain>
Kube-dashboard = http://dashboard.<your-prefix>.<yourdomain>
Pachyderm      = by ssh-ing onto your master node, and installing the Pachyderm client:

curl -o /tmp/pachctl.deb -L https://github.com/pachyderm/pachyderm/releases/download/v1.6.6/pachctl_1.6.6_amd64.deb && sudo dpkg -i /tmp/pachctl.deb

Please note that the client version should correspond with the pachd service version. For more information please consult: http://pachyderm.readthedocs.io/en/latest/index.html. You can see an example on how to create pipelines here: https://github.com/pachyderm/pachyderm/tree/master/doc/examples/gatk

To destroy the cluster, use:

./phenomenal.sh destroy ostack

Deploy on Microsoft Azure

Azure specific prerequisites

Configuration

Start by creating your configuration file, config.azure.sh. A template is provided for your convenience:

mv config.azure.sh-template config.azure.sh

In this configuration file you will need to set:

Cluster

  • TF_VAR_cluster_prefix: every resource in your tenancy will be named with this prefix

  • TF_VAR_location: some Azure location (e.g. West Europe)

  • TF_VAR_subscription_id: your subscription id

  • TF_VAR_client_id: your client id (also called appId)

  • TF_VAR_client_secret: your client secret (also called password)

  • TF_VAR_tenant_id: your tenant id
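
Azure needs more credentials than the other providers, so it is easy to leave one out. The sketch below lists any named variable that is unset or empty. The unset_vars helper is hypothetical and not part of the repository.

```shell
# Print the name of every listed variable that is unset or empty.
unset_vars() {
  local name value
  for name in "$@"; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

# Usage after sourcing config.azure.sh:
#   unset_vars TF_VAR_cluster_prefix TF_VAR_location TF_VAR_subscription_id \
#              TF_VAR_client_id TF_VAR_client_secret TF_VAR_tenant_id
```

An empty output means all listed variables have a value; it does not confirm that the credentials are valid.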

Master configuration

  • TF_VAR_master_vm_size: the VM size for the master (e.g. Standard_DS2_v2)
  • TF_VAR_master_as_edge:

Node configuration

  • TF_VAR_node_count: number of Kubernetes nodes to be created (no floating IP is needed for these nodes)
  • TF_VAR_node_vm_size: the vm size for the Kubernetes nodes (e.g. Standard_DS2_v2)

Gluster configuration

  • TF_VAR_glusternode_count: number of glusternodes to be created (1 - 3 depending on preferred replication factor)
  • TF_VAR_glusternode_vm_size: the VM size for the glusternodes
  • TF_VAR_glusternode_extra_disk_size: size of the file server's extra disk in GB

Edge configuration (optional)

  • TF_VAR_edge_count: number of edge nodes to be created
  • TF_VAR_edge_vm_size: the VM size for the edge nodes

Cloudflare (optional) - See: KubeNow Cloudflare documentation.

  • TF_VAR_use_cloudflare: whether you want to use Cloudflare as DNS provider
  • TF_VAR_cloudflare_email: the mail that you used to register your Cloudflare account
  • TF_VAR_cloudflare_token: an authentication token that you can generate from the Cloudflare web interface
  • TF_VAR_cloudflare_domain: a zone that you created in your Cloudflare account. This typically matches your domain name (e.g. somedomain.com)

Galaxy

  • TF_VAR_galaxy_admin_email: email address of the local Galaxy admin (typically you)
  • TF_VAR_galaxy_admin_password: admin password (at least 6 characters)

Jupyter

  • TF_VAR_jupyter_password: password for your notebook

Pachyderm + Minio (optional)

  • TF_VAR_minio_release_name: release name for the Minio service
  • TF_VAR_minio_pvc_size: storage dedicated for the Minio service (In GB)
  • TF_VAR_minio_accesskey: access key for the S3 endpoint
  • TF_VAR_minio_secretkey: secret key for the S3 endpoint
  • TF_VAR_pachyderm_release_name: a release name for the Pachyderm service
  • TF_VAR_pachyderm_etcd_pvc_size: storage dedicated for etcd (In GB)
  • TF_VAR_pachyderm_minio_accesskey: access key of the S3 endpoint
  • TF_VAR_pachyderm_minio_secretkey: secret key of the S3 endpoint

Once you are done with your settings you are ready to deploy the cluster:

./phenomenal.sh deploy azure

When the deployment has finished, you should be able to reach the services at:

Galaxy         = http://galaxy.<your-prefix>.<yourdomain>
Jupyter        = http://notebook.<your-prefix>.<yourdomain>
Luigi          = http://luigi.<your-prefix>.<yourdomain>
Kube-dashboard = http://dashboard.<your-prefix>.<yourdomain>
Pachyderm      = by ssh-ing onto your master node, and installing the Pachyderm client:

curl -o /tmp/pachctl.deb -L https://github.com/pachyderm/pachyderm/releases/download/v1.6.6/pachctl_1.6.6_amd64.deb && sudo dpkg -i /tmp/pachctl.deb

Please note that the client version should correspond with the pachd service version. For more information please consult: http://pachyderm.readthedocs.io/en/latest/index.html. You can see an example on how to create pipelines here: https://github.com/pachyderm/pachyderm/tree/master/doc/examples/gatk

To destroy the cluster, use:

./phenomenal.sh destroy azure

Deploy on Local Machine (Linux-KVM)

KVM specific prerequisites

  • You are running Linux with a KVM-enabled kernel
  • You have installed

Configuration

Start by creating your configuration file, config.ostack.sh. A template is provided for your convenience:

mv config.ostack.sh-template config.ostack.sh

In this configuration file you will need to set:

Cluster

Master configuration

  • TF_VAR_master_flavor: an instance flavor for the master
  • TF_VAR_master_as_edge:
  • Local file server - Ubuntu.....

Cloudflare (optional) - See: KubeNow Cloudflare documentation.

  • TF_VAR_use_cloudflare: whether you want to use Cloudflare as DNS provider
  • TF_VAR_cloudflare_email: the mail that you used to register your Cloudflare account
  • TF_VAR_cloudflare_token: an authentication token that you can generate from the Cloudflare web interface
  • TF_VAR_cloudflare_domain: a zone that you created in your Cloudflare account. This typically matches your domain name (e.g. somedomain.com)

Galaxy

  • TF_VAR_galaxy_admin_email: email address of the local Galaxy admin (typically you)
  • TF_VAR_galaxy_admin_password: admin password (at least 6 characters)

Jupyter

  • TF_VAR_jupyter_password: password for your notebook

Once you are done with your settings you are ready to deploy the cluster:

./phenomenal.sh deploy kvm

When the deployment has finished, you should be able to reach the services at:

Galaxy         = http://galaxy.<your-prefix>.<yourdomain>
Jupyter        = http://notebook.<your-prefix>.<yourdomain>
Luigi          = http://luigi.<your-prefix>.<yourdomain>
Kube-dashboard = http://dashboard.<your-prefix>.<yourdomain>

To destroy the cluster, use:

./phenomenal.sh destroy kvm
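
All providers in this guide use the same `./phenomenal.sh <action> <provider>` form. A thin wrapper can guard against typos in either argument before anything is provisioned or torn down. This is a sketch: both function names are hypothetical, and it only covers the deploy and destroy actions and the five provider names shown in this guide.

```shell
# Succeed if the argument is one of the provider names used in this guide.
known_provider() {
  case "$1" in
    aws|gcp|ostack|azure|kvm) return 0 ;;
    *)                        return 1 ;;
  esac
}

# Run "./phenomenal.sh <action> <provider>" only for a known action and
# provider; return 2 without running anything otherwise.
run_phenomenal() {
  local action="$1" provider="$2"
  case "$action" in
    deploy|destroy) ;;
    *) echo "unknown action: $action" >&2; return 2 ;;
  esac
  if ! known_provider "$provider"; then
    echo "unknown provider: $provider" >&2
    return 2
  fi
  ./phenomenal.sh "$action" "$provider"
}

# Usage from the cloud-deploy-kubenow directory:
#   run_phenomenal deploy aws
#   run_phenomenal destroy aws
```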

Directories and files

├── cloud_portal            # This is where the cloud portal deploy.sh, destroy.sh and state.sh scripts
│   │                       # are stored in subdirectories per cloud provider
│   │
│   ├── aws                 # Sub directories per cloud provider
│   ├── gcp
│   ├── ostack
│   ├── azure
│   └── shared              # The bulk of deploy.sh, destroy.sh and state.sh is identical between
│                           # providers and resides in shared versions of the scripts, called from
│                           # the provider-specific scripts
│
│
├── KubeNow                 # This is the standard KubeNow git repo included as a git sub-module and this is
│                           # where the terraform and default KubeNow ansible scripts reside (called from the
│                           # deploy.sh, destroy.sh and state.sh scripts)
│
│
├── playbooks               # Ansible playbooks that are Phenomenal release specific and not included in the
│                           # default KubeNow repository
│
│
├── bin                     # Utility scripts that are used in the deploy.sh, destroy.sh and state.sh scripts
│
│
├ manifest.json             # This is the TSI parameter file used to describe the setup
│
│
├ config.openstack.sh-template     # Includes vars expected to be provided from web-ui and only used for local deployment
│
│
├ config.aws.sh-template           # Amazon version of deployment vars
│
│
└ config.gcp.sh-template           # Google cloud version of deployment vars