- Internet-connected physical host running a modern Linux distribution
- Virtualization enabled and Libvirt/KVM set up (details below)
- DNS on the host managed by `dnsmasq` or NetworkManager/`dnsmasq` (details below)
- OpenShift 4 pull secret (downloaded from your Red Hat account)
- `helm`:

  ```
  bash <(curl -sSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3)
  ```

- `yq`:

  ```
  BINARY=yq_linux_amd64
  LATEST=$(wget -qO- https://api.github.com/repos/mikefarah/yq/releases/latest 2>/dev/null \
    | grep browser_download_url | grep "${BINARY}\"\$" | awk '{print $NF}')
  sudo wget -q "$LATEST" -O /usr/bin/yq && sudo chmod +x /usr/bin/yq
  ```
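The pull secret is easy to get wrong (truncated copy/paste, or HTML saved instead of JSON). As an optional sanity check before running the installer, you can verify it parses as JSON with a top-level "auths" key; the path below is the script's default, which you can override with -p/--pull-secret:

```shell
# Sanity-check the pull secret before starting the install.
# Default location used by the install script; override with -p/--pull-secret.
PULL_SECRET=/root/pull-secret

# A valid pull secret is a JSON document with a top-level "auths" key.
python3 -c 'import json,sys; d=json.load(open(sys.argv[1])); sys.exit(0 if "auths" in d else 1)' "$PULL_SECRET" \
  && echo "pull secret looks valid" \
  || echo "pull secret missing or malformed"
```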
On RHEL, the DNF repositories require Red Hat account credentials via `subscription-manager`. To register the host, run:

```
UN=user01
PW=MySuperSecurePassword123
subscription-manager register --username ${UN} --password ${PW}
```

Install the virtualization packages:

```
dnf install qemu-kvm libvirt virt-install virt-viewer
dnf install qemu-img libvirt-client libguestfs-tools-c
```
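Before going further, it is worth confirming that the CPU actually exposes hardware virtualization to the host. A minimal check (vmx = Intel VT-x, svm = AMD-V):

```shell
# Check for hardware virtualization support on the physical host:
# vmx = Intel VT-x, svm = AMD-V.
if grep -qE 'vmx|svm' /proc/cpuinfo; then
  echo "virtualization extensions present"
else
  echo "no vmx/svm flags found - enable virtualization in BIOS/firmware"
fi
```

libvirt also ships a more thorough validator, `virt-host-validate qemu`, which checks the whole stack (KVM device, cgroups, IOMMU, etc.).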
```
## Unmask and enable services
for drv in qemu interface network nodedev nwfilter secret storage; do
  systemctl unmask virt${drv}d.service
  systemctl unmask virt${drv}d{,-ro,-admin}.socket
  systemctl enable virt${drv}d.service
  systemctl enable virt${drv}d{,-ro,-admin}.socket
done

## Start services
for drv in qemu network nodedev nwfilter secret storage interface; do
  systemctl start virt${drv}d{,-ro,-admin}.socket
done
```
(Optional) Enable the remote host virtualization proxy daemon:

```
systemctl unmask virtproxyd.service
systemctl unmask virtproxyd{,-ro,-admin}.socket
systemctl enable virtproxyd.service
systemctl enable virtproxyd{,-ro,-admin}.socket
systemctl start virtproxyd{,-ro,-admin}.socket
```
The following AWS EC2 instance types would fit running two clusters.

For each cluster, you need the following resources:

- 1 Bootstrap node: 4 vCPUs / 16 GiB memory
- 3 Master nodes: 4 vCPUs / 16 GiB memory each
- 2 Worker nodes: 4 vCPUs / 16 GiB memory each
- 1 Load balancer node: 4 vCPUs / 8 GiB memory

This gives a total per cluster of:

- Total vCPUs = 1 x 4 (Bootstrap) + 3 x 4 (Masters) + 2 x 4 (Workers) + 1 x 4 (Load Balancer) = 28 vCPUs
- Total memory = 1 x 16 GiB (Bootstrap) + 3 x 16 GiB (Masters) + 2 x 16 GiB (Workers) + 1 x 8 GiB (Load Balancer) = 104 GiB

Since you are deploying two clusters, the total resource requirements across both clusters are:

- Total vCPUs for two clusters = 28 x 2 = 56 vCPUs
- Total memory for two clusters = 104 GiB x 2 = 208 GiB
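The same arithmetic as a quick shell sketch, with the node counts and sizes taken from the list above:

```shell
# Per-node sizes (vCPUs / GiB) from the sizing list above
BOOTSTRAP_CPU=4; BOOTSTRAP_MEM=16
MASTERS=3;  MASTER_CPU=4; MASTER_MEM=16
WORKERS=2;  WORKER_CPU=4; WORKER_MEM=16
LB_CPU=4;   LB_MEM=8

CPU=$(( BOOTSTRAP_CPU + MASTERS*MASTER_CPU + WORKERS*WORKER_CPU + LB_CPU ))
MEM=$(( BOOTSTRAP_MEM + MASTERS*MASTER_MEM + WORKERS*WORKER_MEM + LB_MEM ))
echo "Per cluster:  ${CPU} vCPUs, ${MEM} GiB"       # Per cluster:  28 vCPUs, 104 GiB
echo "Two clusters: $(( CPU*2 )) vCPUs, $(( MEM*2 )) GiB"  # Two clusters: 56 vCPUs, 208 GiB
```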
m5.24xlarge:

- 96 vCPUs
- 384 GiB memory

This provides sufficient resources for both clusters, with overhead for OpenShift operations.

m5.metal:

- 96 vCPUs
- 384 GiB memory

Offers bare-metal performance with the same vCPU and memory resources.
```
./openshift-libvirt.sh [OPTIONS]
```
Option | Description |
---|---|
-O, --ocp-version VERSION | You can set this to "latest", "stable" or a specific version like "4.1", "4.1.2", "4.1.latest", "4.1.stable" etc. Default: stable |
-R, --rhcos-version VERSION | You can set a specific RHCOS version to use. For example "4.1.0", "4.2.latest" etc. By default the RHCOS version is matched from the OpenShift version. For example, if you selected 4.1.2, RHCOS 4.1/latest will be used. |
-p, --pull-secret FILE | Location of the pull secret file. Default: /root/pull-secret |
-c, --cluster-name NAME | OpenShift 4 cluster name. Default: ocp4 |
-d, --cluster-domain DOMAIN | OpenShift 4 cluster domain. Default: local |
-m, --masters N | Number of masters to deploy. Default: 3 |
-w, --worker N | Number of workers to deploy. Default: 2 |
--master-cpu N | Number of CPUs for the master VM(s). Default: 4 |
--master-mem SIZE(MB) | RAM size (MB) of master VM(s). Default: 16000 |
--worker-cpu N | Number of CPUs for the worker VM(s). Default: 4 |
--worker-mem SIZE(MB) | RAM size (MB) of worker VM(s). Default: 8000 |
--bootstrap-cpu N | Number of CPUs for the bootstrap VM. Default: 4 |
--bootstrap-mem SIZE(MB) | RAM size (MB) of bootstrap VM. Default: 16000 |
--lb-cpu N | Number of CPUs for the load balancer VM. Default: 1 |
--lb-mem SIZE(MB) | RAM size (MB) of load balancer VM. Default: 1024 |
-n, --libvirt-network NETWORK | The libvirt network to use. Select this option if you want to use an existing libvirt network; it should already exist. If you want the script to create a separate network for this installation, see -N, --libvirt-oct. Default: default |
-N, --libvirt-oct OCTET | You can specify a 192.168.{OCTET}.0 subnet octet and this script will create a new libvirt network for the cluster. The network will be named ocp-{OCTET}. If the libvirt network ocp-{OCTET} already exists, it will be used. Default: [not set] |
-v, --vm-dir DIR | The location where you want to store the VM disks. Default: /var/lib/libvirt/images |
-z, --dns-dir DIR | We expect the DNS on the host to be managed by dnsmasq. You can use NetworkManager's built-in dnsmasq or a separate dnsmasq running on the host. If you are running a separate dnsmasq on the host, set this to "/etc/dnsmasq.d". Default: /etc/NetworkManager/dnsmasq.d |
-s, --setup-dir DIR | The location where the script keeps all the files related to the installation. Default: /root/ocp4_setup_{CLUSTER_NAME} |
-x, --cache-dir DIR | To avoid unnecessary downloads, the script downloads the OpenShift/RHCOS files to a cache directory and reuses them if they exist. This way you only download a file once and reuse it for future installs. You can force the script to download a fresh copy by using -X, --fresh-download. Default: /root/ocp4_downloads |
-X, --fresh-download | Set this if you want to force the script to download a fresh copy of the files instead of reusing the ones in the cache dir. Default: [not set] |
-k, --keep-bootstrap | Set this if you want to keep the bootstrap VM. By default the bootstrap VM is removed once the bootstrapping is finished. Default: [not set] |
--autostart-vms | Set this if you want the cluster VMs to auto-start on reboot. Default: [not set] |
-y, --yes | Set this for the script to be non-interactive and continue without asking for confirmation. Default: [not set] |
--destroy | Set this if you want the script to destroy everything it has created. Use this option with the same options you used to install the cluster. Be careful: this deletes the VMs, DNS entries and the libvirt network (if created by the script). Default: [not set] |
```
# Deploy OpenShift 4.17.0 cluster
./openshift-libvirt.sh --ocp-version 4.17.0

# Deploy OpenShift 4.17.0 cluster with RHCOS 4.17.1
./openshift-libvirt.sh --ocp-version 4.17.0 --rhcos-version 4.17.1

# Deploy latest OpenShift version with pull secret from a custom location
./openshift-libvirt.sh --pull-secret /root/Downloads/pull-secret --ocp-version latest

# Deploy OpenShift 4.16.latest with custom cluster name and domain
./openshift-libvirt.sh --cluster-name ocp-01 --cluster-domain lab.test.com --ocp-version 4.16.latest

# Deploy OpenShift 4.14.stable on new libvirt network (192.168.155.0/24)
./openshift-libvirt.sh --ocp-version 4.14.stable --libvirt-oct 155

# Destroy the already installed cluster
./openshift-libvirt.sh --cluster-name ocp-01 --cluster-domain lab.test.com --destroy
```
Once the installation is successful, you will find an `add_node.sh` script in the `--setup-dir` (default: `/root/ocp4_setup_{CLUSTER_NAME}`). You can use it to add more nodes to the cluster post-installation.

```
cd [setup-dir]
./post_scripts/add_node.sh --name [node-name] [OPTIONS]
```
Option | Description |
---|---|
--name NAME | The node name without the domain. For example, if you specify storage-1 and your cluster name is "ocp4" and base domain is "local", the new node will be "storage-1.ocp4.local". Default: [not set] [REQUIRED] |
-c, --cpu N | Number of CPUs to be attached to this node's VM. Default: 2 |
-m, --memory SIZE(MB) | Amount of memory to be attached to this node's VM. Size in MB. Default: 4096 |
-a, --add-disk SIZE(GB) | You can add additional disks to this node. Size in GB. This option can be specified multiple times. Disks are added in order; for example, if you specify "--add-disk 10 --add-disk 100", two disks will be added (on top of the OS disk vda): the first of 10 GB (/dev/vdb) and the second of 100 GB (/dev/vdc). Default: [not set] |
-v, --vm-dir DIR | The location where you want to store the VM disks. By default the location used by the cluster VMs will be used. |
-N, --libvirt-oct OCTET | You can specify a 192.168.{OCTET}.0 subnet octet and this script will create a new libvirt network for this node. The network will be named ocp-{OCTET}. If the libvirt network ocp-{OCTET} already exists, it will be used. This can be useful if you want to add a node in a different network than the one used by the cluster. Default: [not set] |
-n, --libvirt-network NETWORK | The libvirt network to use. Select this option if you want to use an existing libvirt network. By default the existing libvirt network used by the cluster will be used. |
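Putting the options together, a hypothetical invocation that adds a 4-CPU, 8 GiB node with two extra data disks might look like this (the setup dir and node name are illustrative; use your own):

```shell
# Illustrative example: default setup dir for a cluster named "ocp4".
cd /root/ocp4_setup_ocp4

# Creates storage-1.ocp4.local with a 10 GB /dev/vdb and a 100 GB /dev/vdc
# in addition to the OS disk (vda).
./post_scripts/add_node.sh --name storage-1 \
  --cpu 4 --memory 8192 \
  --add-disk 10 --add-disk 100
```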
Once the installation is successful, you will find an `expose_cluster.sh` script in the `--setup-dir` (default: `/root/ocp4_setup_{CLUSTER_NAME}`). You can use it to expose the cluster so it can be accessed from outside.

```
cd [setup-dir]
./expose_cluster.sh --method [ firewalld | haproxy ]
```

If you are running a single cluster on your bare metal machine, you can expose that cluster via the firewalld method (port forwarding). If you want to host and access multiple clusters, use the haproxy method.

Once you have exposed your cluster(s), you must ensure the proper DNS entries are available to your external clients. One simple way to do this is to edit the `/etc/hosts` file on your client machines so that your exposed cluster endpoints are declared. The output of the `expose_cluster.sh` script will give you an example line you can use for your `/etc/hosts` file.
You need to expose a minimum of three endpoints: the OpenShift console, the API endpoint, and the OAuth endpoint. For example, if you installed with the default names (i.e. the cluster name is "ocp4" and the base domain is "local") you will need to expose these three endpoints:
- console-openshift-console.apps.ocp4.local
- api.ocp4.local
- oauth-openshift.apps.ocp4.local
If you will later configure OpenShift to expose its image registry (a typical dev use case that will allow you to push images directly into your cluster), you will need to expose this endpoint as well:
- default-route-openshift-image-registry.apps.ocp4.local
Finally, any custom Route resources you create in your OpenShift cluster will also need to be exposed via DNS.
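For example, a single `/etc/hosts` line on a client machine covering the endpoints above might look like the following (the IP is a placeholder; use the address of the host exposing the cluster):

```
192.0.2.10  api.ocp4.local console-openshift-console.apps.ocp4.local oauth-openshift.apps.ocp4.local default-route-openshift-image-registry.apps.ocp4.local
```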
If you are exposing your cluster using haproxy and SELinux is in Enforcing mode on the hypervisor, you need to tell SELinux to treat port 6443 as a web port; otherwise SELinux will not let haproxy listen on port 6443:

```
semanage port -a -t http_port_t -p tcp 6443
```

Similarly, if firewalld is enabled, you need to open up the necessary ports:

```
firewall-cmd --add-service=http
firewall-cmd --add-service=https
firewall-cmd --add-port=6443/tcp
```

The output of the `expose_cluster.sh --method haproxy` script will remind you about these additional configurations.
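Note that `firewall-cmd` changes made without `--permanent` only last until the next firewalld reload or host reboot. To make the rules persistent (standard firewalld behaviour, not specific to this script):

```shell
# Persist the runtime rules across reboots, then reload firewalld
firewall-cmd --permanent --add-service=http
firewall-cmd --permanent --add-service=https
firewall-cmd --permanent --add-port=6443/tcp
firewall-cmd --reload
```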
By default, if you reboot the host/hypervisor, the VMs will not start up automatically. You can set `--autostart-vms` when running the install script to mark the VMs for auto-start. To see which VMs are or are not set to auto-start, run `virsh list --all --name --autostart` or `virsh list --all --name --no-autostart` respectively.

If you want to change the autostart behaviour, you can set the VMs to auto-start by running:

```
for vm in $(virsh list --all --name --no-autostart | grep "<CLUSTER-NAME>"); do
  virsh autostart ${vm}
done
```

Similarly, to disable auto-starting of the VMs, you can run:

```
for vm in $(virsh list --all --name --autostart | grep "<CLUSTER-NAME>"); do
  virsh autostart --disable ${vm}
done
```

Note: Replace `<CLUSTER-NAME>` with the cluster name, or any matching string, to filter the VMs you want to set/unset for auto-start.
When the bootstrap process is complete, the script waits for clusterversion to become ready before the cluster installation is considered complete. During this phase the script just shows the status/message of the clusterversion operator. You may see various kinds of errors; these are normal and are due to the nature of the operator reconciliation process. For example:

```
====> Waiting for clusterversion:
--> Unable to apply 4.5.0-rc.6: an unknown error has occurred: MultipleErr ...
--> Unable to apply 4.5.0-rc.6: an unknown error has occurred: MultipleErr ...
--> Working towards 4.5.0-rc.6: 62% complete
--> Working towards 4.5.0-rc.6: 62% complete
--> Unable to apply 4.5.0-rc.6: an unknown error has occurred: MultipleErr ...
--> Working towards 4.5.0-rc.6: 99% complete

====> Waiting for clusterversion:
--> Working towards 4.3.12: 46% complete
--> Unable to apply 4.3.12: an unknown error has occurred
--> Working towards 4.3.12: 61% complete
--> Unable to apply 4.3.12: an unknown error has occurred
--> Unable to apply 4.3.12: an unknown error has occurred
--> Unable to apply 4.3.12: an unknown error has occurred
```

Just let it run; the clusterversion operator should reconcile and become ready eventually.
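If you want to follow this phase yourself from another terminal, you can watch the clusterversion and cluster operators directly with `oc`. The kubeconfig path below is an assumption (the installer writes it under `auth/kubeconfig` inside its install dir); adjust it to wherever your setup dir keeps it:

```shell
# Assumption: kubeconfig lives under the setup dir's install directory;
# adjust the path for your cluster name and layout.
export KUBECONFIG=/root/ocp4_setup_ocp4/install_dir/auth/kubeconfig

oc get clusterversion
oc get clusteroperators

# Keep watching; the grep hides operators that are already
# Available=True, Progressing=False, Degraded=False.
watch -n 10 'oc get clusterversion; oc get clusteroperators | grep -v "True.*False.*False"'
```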