Cloud-deploy-kubenow

This page will guide you to set up a PhenoNeNal CRE on Amazon, Google Cloud, Microsoft Azure or in a public or private OpenStack environment through the command-line. Normally, you would use the convenient PhenoMeNal portal to launch a CRE on the supported cloud providers, which under the hood is using the procedure below. But in special cases (private OpenStack, or for developers) you want to use the infrastructure provisioning procedure without the web GUI.

Prerequisites

There are some tools that you need installed on your local machine, in order to provision Phenomenal-KubeNow:

Git to clone/download the install scripts from github repo
Docker to run the container with all other dependencies

Get Phenomenal-KubeNow

Phenomenal-KubeNow are distributed via GitHub:

# the repository contains submodules therefore `--recursive` parameter when cloning e.g.
git clone --recursive https://github.com/phnmnl/cloud-deploy-kubenow.git

cd cloud-deploy-kubenow

# If you later want to pull latest version and also pull latest submodule updates:

git pull --recurse-submodules
git submodule update --recursive --remote

All of the commands in this documentation are meant to be run in the cloud-deploy-kubenow directory.

Deploy on Amazon Web Services

Amazon specific prerequisites

You have an IAM user along with its access key and security credentials (http://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html)

Configuration

Start by creating your configuration file: config.aws.sh There is a template that you can use for your convenience:

mv config.aws.sh-template config.aws.sh

In this configuration file you will need to set:

Cluster

TF_VAR_cluster_prefix: every resource in your tenancy will be named with this prefix
TF_VAR_aws_access_key_id: your access key id
TF_VAR_aws_secret_access_key: your secret access key id
TF_VAR_aws_region: the region where your cluster will be bootstrapped (e.g. eu-west-1)
TF_VAR_availability_zone: an availability zone for your cluster (e.g. eu-west-1a)

Master configuration

TF_VAR_master_instance_type: an instance flavor for the master
TF_VAR_master_as_edge:

Node configuration

TF_VAR_node_count: number of Kubernetes nodes to be created (no floating IP is needed for these nodes)
TF_VAR_node_instance_type: an instance flavor name for the Kubernetes nodes

Gluster configuration

TF_VAR_glusternode_count: number of egde nodes to be created (1 - 3 depending on preferred replication factor)
TF_VAR_glusternode_instance_type: an instance flavor for the glusternodes
TF_VAR_glusternode_extra_disk_size: disk size of the fileserver size in GB

Edge configuration (optional)

TF_VAR_edge_count: number of egde nodes to be created
TF_VAR_edge_instance_type: an instance flavor for the edge nodes

Cloudflare (optional) - See: KubeNow Cloudflare documentation.

TF_VAR_use_cloudflare: wether you want to use cloudflare as dns provider
TF_VAR_cloudflare_email: the mail that you used to register your Cloudflare account
TF_VAR_cloudflare_token: an authentication token that you can generate from the Cloudflare web interface
TF_VAR_cloudflare_domain: a zone that you created in your Cloudflare account. This typically matches your domain name (e.g. somedomain.com)

Galaxy

TF_VAR_galaxy_admin_email: the local galaxy admin (you?)
TF_VAR_galaxy_admin_password: min 6 characters admin password

Jupyter

TF_VAR_jupyter_password: password for your notebook

Pachyderm + Minio (optional)

TF_VAR_minio_release_name: release name for the Minio service
TF_VAR_minio_pvc_size: storage dedicated for the Minio service (In GB)
TF_VAR_minio_accesskey: access key for the S3 endpoint
TF_VAR_minio_secretkey: secret key for the S3 endpoint
TF_VAR_pachyderm_release_name: a release name for the Pachyderm service
TF_VAR_pachyderm_etcd_pvc_size: storage dedicated for etcd (In GB)
TF_VAR_pachyderm_minio_accesskey: access key of the S3 endpoint
TF_VAR_pachyderm_minio_secretkey: secret key of the S3 endpoint

Once you are done with your settings you are ready to deploy the cluster:

./phenomenal.sh deploy aws

when deployment is finished then you should be able to reach the services at:

Galaxy         = http://galaxy.<your-prefix>.<yourdomain>
Jupyter        = http://notebook.<your-prefix>.<yourdomain>
Luigi          = http://luigi.<your-prefix>.<yourdomain>
Kube-dashboard = http://dashboard.<your-prefix>.<yourdomain>
Pachyderm      = by ssh-ing onto your master node, and installing the Pachyderm client:

curl -o /tmp/pachctl.deb -L https://github.com/pachyderm/pachyderm/releases/download/v1.6.6/pachctl_1.6.6_amd64.deb && sudo dpkg -i /tmp/pachctl.deb

Please note that the client version should correspond with the pachd service version. For more information please consult: http://pachyderm.readthedocs.io/en/latest/index.html. You can see an example on how to create pipelines here: https://github.com/pachyderm/pachyderm/tree/master/doc/examples/gatk

and to destroy use:

./phenomenal.sh destroy aws

Deploy on Google Cloud Platform

Google cloud specific prerequisites

You have enabled the Google Compute Engine API: API Manager > Library > Compute Engine API > Enable
You have created and downloaded a service account file for your GCE project: Api manager > Credentials > Create credentials > Service account key
You installed python package apache-libcloud and jmespath (e.g. sudo pip install apache-libcloud jmespath)

Configuration

Start by creating your configuration file: config.gcp.sh There is a template that you can use for your convenience:

mv config.gcp.sh-template config.gcp.sh

In this configuration file you will need to set:

Cluster

TF_VAR_cluster_prefix: every resource in your tenancy will be named with this prefix
TF_VAR_gce_credentials_file: path to your service account file
TF_VAR_gce_region: the zone for your project (e.g. europe-west1-b)
TF_VAR_gce_project: your project id

Master configuration

TF_VAR_master_flavor: an instance flavor for the master
TF_VAR_master_as_edge:

Node configuration

TF_VAR_node_count: number of Kubernetes nodes to be created (no floating IP is needed for these nodes)
TF_VAR_node_flavor: an instance flavor name for the Kubernetes nodes

Gluster configuration

TF_VAR_glusternode_count: number of egde nodes to be created (1 - 3 depending on preferred replication factor)
TF_VAR_glusternode_flavor: an instance flavor for the glusternodes
TF_VAR_glusternode_extra_disk_size: disk size of the fileserver size in GB

Edge configuration (optional)

TF_VAR_edge_count: number of egde nodes to be created
TF_VAR_edge_iflavor: an instance flavor for the edge nodes

Cloudflare (optional) - See: KubeNow Cloudflare documentation.

TF_VAR_use_cloudflare: wether you want to use cloudflare as dns provider
TF_VAR_cloudflare_email: the mail that you used to register your Cloudflare account
TF_VAR_cloudflare_token: an authentication token that you can generate from the Cloudflare web interface
TF_VAR_cloudflare_domain: a zone that you created in your Cloudflare account. This typically matches your domain name (e.g. somedomain.com)

Galaxy

TF_VAR_galaxy_admin_email: the local galaxy admin (you?)
TF_VAR_galaxy_admin_password: min 6 characters admin password

Jupyter

TF_VAR_jupyter_password: password for your notebook

Pachyderm + Minio (optional)

TF_VAR_minio_release_name: release name for the Minio service
TF_VAR_minio_pvc_size: storage dedicated for the Minio service (In GB)
TF_VAR_minio_accesskey: access key for the S3 endpoint
TF_VAR_minio_secretkey: secret key for the S3 endpoint
TF_VAR_pachyderm_release_name: a release name for the Pachyderm service
TF_VAR_pachyderm_etcd_pvc_size: storage dedicated for etcd (In GB)
TF_VAR_pachyderm_minio_accesskey: access key of the S3 endpoint
TF_VAR_pachyderm_minio_secretkey: secret key of the S3 endpoint

Once you are done with your settings you are ready to deploy the cluster:

./phenomenal.sh deploy gcp

when deployment is finished then you should be able to reach the services at:

Galaxy         = http://galaxy.<your-prefix>.<yourdomain>
Jupyter        = http://notebook.<your-prefix>.<yourdomain>
Luigi          = http://luigi.<your-prefix>.<yourdomain>
Kube-dashboard = http://dashboard.<your-prefix>.<yourdomain>
Pachyderm      = by ssh-ing onto your master node, and installing the Pachyderm client:

curl -o /tmp/pachctl.deb -L https://github.com/pachyderm/pachyderm/releases/download/v1.6.6/pachctl_1.6.6_amd64.deb && sudo dpkg -i /tmp/pachctl.deb

and to destroy use:

./phenomenal.sh destroy gcp

Deploy on Openstack

Openstack specific prerequisites

You have downloaded the OpenStack RC file (credentials) for your tenancy: https://docs.openstack.org/user-guide/common/cli-set-environment-variables-using-openstack-rc.html#download-and-source-the-openstack-rc-file

Configuration

Start by creating your configuration file: config.ostack.sh There is a template that you can use for your convenience:

mv config.ostack.sh-template config.ostack.sh

In this configuration file you will need to set:

Cluster

TF_VAR_cluster_prefix: every resource in your tenancy will be named with this prefix
TF_VAR_os_credentials_file: your openstack credentials file: https://docs.openstack.org/user-guide/common/cli-set-environment-variables-using-openstack-rc.html#download-and-source-the-openstack-rc-file
TF_VAR_floating_ip_pool: a floating IP pool name
TF_VAR_external_network_uuid: the uuid of the external network in the OpenStack tenancy
TF_VAR_dns_nameservers: (optional, only needed if you want to use other dns-servers than default 8.8.8.8 and 8.8.4.4)

Master configuration

TF_VAR_master_flavor: an instance flavor for the master
TF_VAR_master_as_edge:

Node configuration

TF_VAR_node_count: number of Kubernetes nodes to be created (no floating IP is needed for these nodes)
TF_VAR_node_flavor: an instance flavor name for the Kubernetes nodes

Gluster configuration

TF_VAR_glusternode_count: number of egde nodes to be created (1 - 3 depending on preferred replication factor)
TF_VAR_glusternode_flavor: an instance flavor for the glusternodes
TF_VAR_glusternode_extra_disk_size: disk size of the fileserver size in GB

Edge configuration (optional)

TF_VAR_edge_count: number of egde nodes to be created
TF_VAR_edge_flavor: an instance flavor for the edge nodes

Cloudflare (optional) - See: KubeNow Cloudflare documentation.

TF_VAR_use_cloudflare: wether you want to use cloudflare as dns provider
TF_VAR_cloudflare_email: the mail that you used to register your Cloudflare account
TF_VAR_cloudflare_token: an authentication token that you can generate from the Cloudflare web interface
TF_VAR_cloudflare_domain: a zone that you created in your Cloudflare account. This typically matches your domain name (e.g. somedomain.com)

Galaxy

TF_VAR_galaxy_admin_email: the local galaxy admin (you?)
TF_VAR_galaxy_admin_password: min 6 characters admin password

Jupyter

TF_VAR_jupyter_password: password for your notebook

Pachyderm + Minio (optional)

TF_VAR_minio_release_name: release name for the Minio service
TF_VAR_minio_pvc_size: storage dedicated for the Minio service (In GB)
TF_VAR_minio_accesskey: access key for the S3 endpoint
TF_VAR_minio_secretkey: secret key for the S3 endpoint
TF_VAR_pachyderm_release_name: a release name for the Pachyderm service
TF_VAR_pachyderm_etcd_pvc_size: storage dedicated for etcd (In GB)
TF_VAR_pachyderm_minio_accesskey: access key of the S3 endpoint
TF_VAR_pachyderm_minio_secretkey: secret key of the S3 endpoint

Once you are done with your settings you are ready to deploy the cluster:

./phenomenal.sh deploy ostack

when deployment is finished then you should be able to reach the services at:

Galaxy         = http://galaxy.<your-prefix>.<yourdomain>
Jupyter        = http://notebook.<your-prefix>.<yourdomain>
Luigi          = http://luigi.<your-prefix>.<yourdomain>
Kube-dashboard = http://dashboard.<your-prefix>.<yourdomain>
Pachyderm      = by ssh-ing onto your master node, and installing the Pachyderm client:

curl -o /tmp/pachctl.deb -L https://github.com/pachyderm/pachyderm/releases/download/v1.6.6/pachctl_1.6.6_amd64.deb && sudo dpkg -i /tmp/pachctl.deb

and to destroy use:

./phenomenal.sh destroy ostack

Deploy on Microsoft Azure

Azure specific prerequisites

You have created an application API key (Service Principal) in your Microsoft Azure subscription: (https://www.terraform.io/docs/providers/azurerm/authenticating_via_service_principal.html#creating-a-service-principal)

Configuration

Start by creating your configuration file: config.azure.sh There is a template that you can use for your convenience:

mv config.azure.sh-template config.azure.sh

In this configuration file you will need to set:

Cluster

TF_VAR_cluster_prefix: every resource in your tenancy will be named with this prefix
TF_VAR_location: some Azure location (e.g. West Europe)
TF_VAR_subscription_id: your subscription id
TF_VAR_client_id: your client id (also called appId)
TF_VAR_client_secret: your client secret (also called password)
TF_VAR_tenant_id: your tenant id

Master configuration

TF_VAR_master_vm_size: the vm size for the master (e.g. Standard_DS2_v2) (e.g. Standard_DS2_v2)
TF_VAR_master_as_edge:

Node configuration

TF_VAR_node_count: number of Kubernetes nodes to be created (no floating IP is needed for these nodes)
TF_VAR_node_vm_size: the vm size for the Kubernetes nodes (e.g. Standard_DS2_v2)

Gluster configuration

TF_VAR_glusternode_count: number of egde nodes to be created (1 - 3 depending on preferred replication factor)
TF_VAR_glusternode_vm_size: the vm size for the glusternodes
TF_VAR_glusternode_extra_disk_size: disk size of the fileserver size in GB

Edge configuration (optional)

TF_VAR_edge_count: number of egde nodes to be created
TF_VAR_edge_vm_size: the vm size for the the edge nodes

Cloudflare (optional) - See: KubeNow Cloudflare documentation.

TF_VAR_use_cloudflare: wether you want to use cloudflare as dns provider
TF_VAR_cloudflare_email: the mail that you used to register your Cloudflare account
TF_VAR_cloudflare_token: an authentication token that you can generate from the Cloudflare web interface
TF_VAR_cloudflare_domain: a zone that you created in your Cloudflare account. This typically matches your domain name (e.g. somedomain.com)

Galaxy

TF_VAR_galaxy_admin_email: the local galaxy admin (you?)
TF_VAR_galaxy_admin_password: min 6 characters admin password

Jupyter

TF_VAR_jupyter_password: password for your notebook

Pachyderm + Minio (optional)

TF_VAR_minio_release_name: release name for the Minio service
TF_VAR_minio_pvc_size: storage dedicated for the Minio service (In GB)
TF_VAR_minio_accesskey: access key for the S3 endpoint
TF_VAR_minio_secretkey: secret key for the S3 endpoint
TF_VAR_pachyderm_release_name: a release name for the Pachyderm service
TF_VAR_pachyderm_etcd_pvc_size: storage dedicated for etcd (In GB)
TF_VAR_pachyderm_minio_accesskey: access key of the S3 endpoint
TF_VAR_pachyderm_minio_secretkey: secret key of the S3 endpoint

Once you are done with your settings you are ready to deploy the cluster:

./phenomenal.sh deploy azure

when deployment is finished then you should be able to reach the services at:

Galaxy         = http://galaxy.<your-prefix>.<yourdomain>
Jupyter        = http://notebook.<your-prefix>.<yourdomain>
Luigi          = http://luigi.<your-prefix>.<yourdomain>
Kube-dashboard = http://dashboard.<your-prefix>.<yourdomain>
Pachyderm      = by ssh-ing onto your master node, and installing the Pachyderm client:

curl -o /tmp/pachctl.deb -L https://github.com/pachyderm/pachyderm/releases/download/v1.6.6/pachctl_1.6.6_amd64.deb && sudo dpkg -i /tmp/pachctl.deb

and to destroy use:

./phenomenal.sh destroy azure

Deploy on Local Machine (Linux-KVM)

Openstack specific prerequisites

You are running Linux with KVM-enabled kernel
You have installed

Configuration

Start by creating your configuration file: config.ostack.sh There is a template that you can use for your convenience:

mv config.ostack.sh-template config.ostack.sh

In this configuration file you will need to set:

Cluster

TF_VAR_cluster_prefix: every resource in your tenancy will be named with this prefix
TF_VAR_os_credentials_file: your openstack credentials file: https://docs.openstack.org/user-guide/common/cli-set-environment-variables-using-openstack-rc.html#download-and-source-the-openstack-rc-file
TF_VAR_floating_ip_pool: a floating IP pool name
TF_VAR_external_network_uuid: the uuid of the external network in the OpenStack tenancy
TF_VAR_dns_nameservers: (optional, only needed if you want to use other dns-servers than default 8.8.8.8 and 8.8.4.4)

Master configuration

TF_VAR_master_flavor: an instance flavor for the master
TF_VAR_master_as_edge:

Local file server - Ubuntu.....

Cloudflare (optional) - See: KubeNow Cloudflare documentation.

TF_VAR_use_cloudflare: wether you want to use cloudflare as dns provider
TF_VAR_cloudflare_email: the mail that you used to register your Cloudflare account
TF_VAR_cloudflare_token: an authentication token that you can generate from the Cloudflare web interface
TF_VAR_cloudflare_domain: a zone that you created in your Cloudflare account. This typically matches your domain name (e.g. somedomain.com)

Galaxy

TF_VAR_galaxy_admin_email: the local galaxy admin (you?)
TF_VAR_galaxy_admin_password: min 6 characters admin password

Jupyter

TF_VAR_jupyter_password: password for your notebook

Once you are done with your settings you are ready to deploy the cluster:

./phenomenal.sh deploy kvm

when deployment is finished then you should be able to reach the services at:

Galaxy         = http://galaxy.<your-prefix>.<yourdomain>
Jupyter        = http://notebook.<your-prefix>.<yourdomain>
Luigi          = http://luigi.<your-prefix>.<yourdomain>
Kube-dashboard = http://dashboard.<your-prefix>.<yourdomain>

and to destroy use:

./phenomenal.sh destroy kvm

Directories and files

├── cloud_portal            # This is where the cloud portal deploy.sh, destroy.sh and state.sh scripts
│   │                       # are stored in subdirectories per cloud provider
│   │
│   ├── aws                 # Sub directories per cloud provider
│   ├── gcp
│   ├── ostack
│   ├── azure
│   └── shared              # The bulk part of the deploy.sh, destroy.sh and state.sh are identical between
│                           # provides and is residing in a shared version of the scripts called from the
│                           # provider speciffic scripts
│
│
├── KubeNow                 # This is the standard KubeNow git repo included as a git sub-module and this is
│                           # where the terraform and default KubeNow ansible scripts reside (called from the
│                           # deploy.sh, destroy.sh and state.sh scripts)
│
│
├── playbooks               # Ansible playbooks that are Phenomenal release speciffic and not included in the
│                           # default KubeNow repository
│
│
├── bin                     # Utility script that are used in the deploy.sh, destroy.sh and state.sh scripts
│
│
├ manifest.json             # This is the TSI parameter file used to describe the setup
│
│
├ config.openstack.sh-template     # Includes vars expected to be provided from web-ui and only used for local deployment
│
│
├ config.aws.sh-template           # Amazon version of deployment vars
│
│
└ config.gcp.sh-template           # Google cloud version of deployment vars

jonandernovella/cloud-deploy-kubenow

Cloud-deploy-kubenow

Prerequisites

Get Phenomenal-KubeNow

Deploy on Amazon Web Services

Deploy on Google Cloud Platform

Deploy on Openstack

Deploy on Microsoft Azure

Deploy on Local Machine (Linux-KVM)

Directories and files