This repository contains various Ansible playbooks, templates, and other support files used to provision OpenShift environments onto AWS.
In order to use these scripts, you will need to set a few things up.
- An AWS IAM account with the following permissions:
- Policies can be defined for Users, Groups or Roles
- Navigate to: AWS Dashboard -> Identity & Access Management -> Select Users or Groups or Roles -> Permissions -> Inline Policies -> Create Policy -> Custom Policy
- Policy Name: openshift (your preference)
- Policy Document:
```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1459269951000",
            "Effect": "Allow",
            "Action": [
                "cloudformation:*",
                "iam:*",
                "route53:*",
                "elasticloadbalancing:*",
                "ec2:*",
                "cloudwatch:*",
                "autoscaling:*",
                "s3:*"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}
```
Finer-grained permissions are possible, and pull requests are welcome.
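If you prefer the CLI to the console, a policy like the one above can be attached with `aws iam put-user-policy`. This is only a sketch: the user name is a placeholder, and it assumes you saved the policy document above as `openshift-policy.json`:

```
# User name and file name are examples; adjust to your own IAM setup
aws iam put-user-policy \
    --user-name my-openshift-user \
    --policy-name openshift \
    --policy-document file://openshift-policy.json
```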
- AWS credentials for the account above must be used with the AWS command line tool (detailed below)
- A Route 53 public hosted zone is required so the scripts can create DNS entries for the resources they provision. Two DNS entries will be created for workshops:
  - `master.guid.domain.tld` - a DNS entry pointing to the master
  - `*.cloudapps.guid.domain.tld` - a wildcard DNS entry pointing to the router/infrastructure node
- An EC2 SSH keypair should be created in advance; save the key file to your system (see the sketch after this list).
- A Red Hat Customer Portal account that has appropriate OpenShift subscriptions
- Red Hat employee subscriptions can be used
- Python version 2.7.x (3.x untested and may not work)
- Python Boto version 2.41 or greater
- Ansible version 2.1.2 or greater
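As noted above, the EC2 keypair can be created ahead of time with the `aws` CLI. A minimal sketch, assuming a hypothetical key name and the `us-west-1` region used later in this document:

```
# Key name and path are examples; keep the resulting .pem file private
aws ec2 create-key-pair \
    --region us-west-1 \
    --key-name openshift-workshop \
    --query 'KeyMaterial' \
    --output text > ~/.ssh/openshift-workshop.pem
chmod 600 ~/.ssh/openshift-workshop.pem
```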
Python and the Python dependencies may be installed via your OS's package manager (e.g. python2-boto on Fedora/CentOS/RHEL) or via pip. A Python virtualenv also works.
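For example, a pip-based install pinned to the minimum versions above might look like this (a sketch; a virtualenv or your package manager works just as well):

```
# Install the minimum supported versions of the Python dependencies
pip install 'boto>=2.41' 'ansible>=2.1.2'

# Verify what you ended up with
python --version
python -c 'import boto; print(boto.__version__)'
ansible --version
```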
The `bu-workshop` files stand up an environment running on Amazon Web Services. They use the CloudFormation, EC2, VPC, and Route 53 services within AWS. They provision several RHEL 7-based servers participating in an OpenShift 3 environment that has persistent storage for its infrastructure components. Additionally, the scripts set up OpenShift's metrics and logging aggregation services. Lastly, they set up and configure various workshop services, users, and volumes for those users.
When using the "bu-workshop" playbooks, the following holds true:
- one master
- one infrastructure node
- twenty-four (24) "application" nodes
- one nfs server
- one bastion host
- GitLab
- Nexus (although unused in labs due to performance/scalability issues in large workshops)
- Workshop lab guide built via S2I
You will need to place your EC2 credentials in the `~/.aws/credentials` file:

```
[default]
aws_access_key_id = foo
aws_secret_access_key = bar
```
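If you already have the `aws` CLI installed, `aws configure` can write this file for you:

```
# Prompts for the access key ID, secret access key, default region, and output format
aws configure
```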
If your operating system has an SSH agent and you are not using your default configured SSH key, you will need to add the private key you use with your EC2 instances to your SSH agent:
```
ssh-add <path to key file>
```
Note that if you use an SSH config that specifies which keys to use for which hosts, this step may not be necessary.
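If you prefer the SSH config route, a minimal sketch of such an entry follows; the host pattern and key path are placeholders, not values the playbooks require:

```
# Append a host-specific key to your SSH config (pattern and path are examples)
cat >> ~/.ssh/config <<'EOF'
Host *.compute.amazonaws.com
    IdentityFile ~/.ssh/openshift-workshop.pem
EOF
```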
Each "environment" has two vars files _vars
and _secret_vars
in the
Environment
folder. The example_secret_vars
file shows the format for what
to put in your bu-workshop_secret_vars
file, if you were using the
bu-workshop
playbook.
The `bu-workshop_vars` file contains most of the configuration settings to use in the environment. Really, the only ones you should expect to modify are the domain-related options and the number of (workshop) users. All AMIs and sizing are preconfigured and automatic for the AWS region you deploy into.
Additionally, you will need to edit the `HostedZoneId` in the CloudFormation template to correspond to your own DNS zone.
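If you do not have the zone ID handy, the `aws` CLI can look it up; replace `domain.tld.` with your own zone name (note the trailing dot):

```
# Prints the ID of the matching hosted zone, e.g. /hostedzone/Z3EXAMPLE
aws route53 list-hosted-zones \
    --query 'HostedZones[?Name==`domain.tld.`].Id' \
    --output text
```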
Once you have installed your prerequisites and have configured all settings and files, simply run Ansible like so:
```
ansible-playbook -i 127.0.0.1, ansible/bu-workshop.yml -e "config=bu-workshop" -e "aws_region=us-west-1" -e "guid=atlanta"
```
Be sure to exchange `guid` for a sensible prefix of your choosing. If you want more or fewer nodes, you can pass the `num_nodes` variable with the value you desire when calling `ansible-playbook`, as shown below.
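For example, a run requesting ten nodes (the count is illustrative):

```
ansible-playbook -i 127.0.0.1, ansible/bu-workshop.yml \
    -e "config=bu-workshop" -e "aws_region=us-west-1" \
    -e "guid=atlanta" -e "num_nodes=10"
```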
You must select the correct AWS region.
An S3 bucket is used to back the Docker registry. AWS will not let you delete a
non-empty S3 bucket, so you must empty it manually. The `aws` CLI makes this easy:

```
aws s3 rm s3://bucket-name --recursive
```
Your bucket is named `{{config}}-{{guid}}`. So, in the case of a `bu-workshop` environment where you provided the `guid` of "atlanta", your S3 bucket is called `bu-workshop-atlanta`.
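For that example environment, emptying (and, if you like, removing) the bucket would look like this; note that if the bucket was created by the CloudFormation stack, deleting the stack should remove the now-empty bucket for you:

```
# Empty the example bucket, then optionally remove it outright
aws s3 rm s3://bu-workshop-atlanta --recursive
aws s3 rb s3://bu-workshop-atlanta
```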
Just go into your AWS account to the CloudFormation section in the region where you provisioned, find the deployed stack, and delete it.
This Ansible script places entries into your `~/.ssh/config`. It is recommended that you remove them once you are done with your environment.
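One way to find those entries before editing the file by hand, assuming your `guid` appears in them (an assumption; inspect the file to be sure):

```
# "atlanta" is the example guid from above; review matches before deleting anything
grep -n 'atlanta' ~/.ssh/config
```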
When you simply throw away the CloudFormation stack, the systems stay attached to your Red Hat subscription. Go to access.redhat.com, manage your subscriptions, and remove the systems manually.
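Alternatively, assuming the instances are still reachable before you delete the stack, you can unregister each one yourself and skip the portal cleanup:

```
# Run on each RHEL instance while it is still up
sudo subscription-manager unregister
```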
Information will be added here as problems are solved. So far the process is pretty vanilla, but quite slow. Expect deployment to take at least an hour, or two or more if you are far from the systems.
It has been seen that, on occasion, EC2 itself is unstable. This manifests in various ways:
- The autoscaling group for the nodes takes an extremely long time to deploy, or never finishes deploying.
- Individual EC2 instances may have terrible performance, which can result in nodes that seem to be "hung" despite being reachable via SSH.
There is not much that can be done in this circumstance besides starting over (in a different region).
While Ansible is idempotent and supports being re-run, there are some known issues with doing so. Specifically:
- You should skip the tag `nfs_tasks` with the `--skip-tags` option if you re-run the playbook after the NFS server has been provisioned and configured. The play is not safe to re-run and will fail.
- You may also wish to skip the tag `bastion_proxy_config` when re-running, as the tasks associated with this play will re-write the same entries to your SSH config file, which could result in hosts becoming unexpectedly unreachable.
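Putting those together, a re-run that skips both tags might look like this sketch (same example values as before):

```
ansible-playbook -i 127.0.0.1, ansible/bu-workshop.yml \
    -e "config=bu-workshop" -e "aws_region=us-west-1" -e "guid=atlanta" \
    --skip-tags nfs_tasks,bastion_proxy_config
```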