This playbook generates a cluster overview report for an Open Shift 3 Cluster.
- Requires no additional dependency
- Generates a stand-alone HTML file
- Light-weight and fast
- Very few oc commands are run on the cluster, data is processed on the ansible node.
- Modular design, new report components can be easily added
If you would like to request a new component/features in the report, please open a new github issue.
Install the required python-jmespath
package:
sudo yum install python-jmespath -y
Clone/copy this repository on the ansible node of the cluster:
git clone https://github.com/kxr/ocpcr.git
Change into the directory:
cd ocpcr
And run the cluster-report.yaml playbook
ansible-playbook cluster-report.yaml
If the ansible inventory of the cluster is in a non-default location, specify it with -i:
ansible-playbook cluster-report.yaml -i /root/ocp311-inventory
Once the playbook finishes, it will generate the report in a HTML file OpenShift_Cluster_Overview_Report_<date>.html
The playbook can email the report to one or more recipents as well.
Set the email_report:
to True and provide the SMTP configuration in cluster-report.yaml
email_report: True
email_from: cluster-report@email.com (OpenShift Cluster Overview)
email_recipients:
- khizer@email.com
- someone@email.com
smtp_host: smtp.gmail.com
smtp_port: 587
smtp_user: smtp_auth_user@email.com
smtp_pass: xxxxxx
- Check for orphaned pods
- Compare the master-config.yaml
- Elasticsearch status, indexes
- Warning events in the core namespaces
- Quick certificate check
- Gluster stroage check
- Pending CSR check
- Core pods (registry, es, prometheus etc.) using persistent storage check
- Latency check of core services (api, registry, router) from all nodes
- All pods on each node using hostpath/localvolumes
- Persistent volumes check
- Service serving certificate check
- Disk usage of important directories (/var, /var/lib/docker /var/lib/etcd etc.)
- Top 10 namespaces using CPU/Memory requests/limits
- Handle CRI-O
- Show network info i.e, host IPs, oc-hostsubnets
- DNS on nodes/pods check
- Proxy on nodes/pods check
- Users/serviceaccounts with admin/cluster-admin privileges
- Imagestream check (deployments using old istreams)
- OVS flow, bridges, interfaces check
- Create proper document/wiki
- Use bootstrap's datatables to allow for interactive table filtering.
This is work in progress and I will be adding more components to the report. To understand the architecture of the playbook / jinja templating, please see the wiki section.
Please feel free to write new components and send a pull request.