DevOps Case Study

Requirements

Init cluster

vagrant up

Shutdown cluster

vagrant halt

Destroy cluster

vagrant destroy -f

Get KUBECONFIG

vagrant ssh master -c "sudo cat ~/.kube/config" > ~/.kube/devops-case

Export KUBECONFIG

export KUBECONFIG=~/.kube/devops-case

Cert Exp Service Endpoint

curl https://k8s-cert-exp.ilkerispir.com/

Grafana

curl https://grafana.ilkerispir.com/

Prometheus

curl https://prometheus.ilkerispir.com/

Tekton

curl https://tekton.ilkerispir.com/

ToDo List

  • Flowchart (Excalidraw)
  • K8s cluster
  • Monitoring system (Prometheus & Grafana)
  • Alert thresholds (at least 3 alerts)
  • Redeploy & reconfigure (IaC example with Ansible)
  • Certs & expiration dates service
  • CI/CD

Architectures

K8s Arch

k8s-arch

Monitoring, Alert, Time Series Database Tools

Top alert definitions for 3 metrics

  • CPU or RAM usage above 80%
  • Pod or node crashes, whatever the cause
  • Disk running out of free space
  • APM Metrics
    • Error rates
    • Response Times
    • Uptime
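The threshold alerts listed above can be sketched as a small evaluation function. This is only an illustration of the logic, not the Prometheus rules used in this repository: the `NodeSample` type, the alert names, and the 10% free-disk limit are assumptions, and only the 80% CPU/RAM figure comes from the list.

```python
from dataclasses import dataclass

# Only the 80% CPU/RAM limit comes from the list above.
CPU_RAM_LIMIT_PCT = 80.0
# Assumed value for "out of free space"; the source does not give a number.
DISK_FREE_MIN_PCT = 10.0


@dataclass
class NodeSample:
    """One hypothetical metrics sample for a single machine."""
    node: str
    cpu_pct: float
    ram_pct: float
    disk_free_pct: float
    ready: bool


def evaluate(sample: NodeSample) -> list[str]:
    """Return the (hypothetical) alert names that would fire for one sample."""
    alerts = []
    if sample.cpu_pct > CPU_RAM_LIMIT_PCT:
        alerts.append("HighCpuUsage")
    if sample.ram_pct > CPU_RAM_LIMIT_PCT:
        alerts.append("HighMemoryUsage")
    if sample.disk_free_pct < DISK_FREE_MIN_PCT:
        alerts.append("LowDiskSpace")
    if not sample.ready:
        alerts.append("NodeNotReady")
    return alerts


if __name__ == "__main__":
    print(evaluate(NodeSample("node", 91.0, 40.0, 5.0, True)))
```

The comparison is strict, so a sample sitting exactly at 80% does not fire. In a real setup the same conditions would normally live in Prometheus alerting rules with a `for:` duration to avoid flapping.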

Monitoring & Alert Arch

k8s-monitoring

Prometheus Example Alerts

kwatch

Grafana Example Dashboard

kwatch

Kwatch Example Alert Message

kwatch

IaC Example with Ansible

  • Consider a scenario where we need to apply a configuration change to several machines in one batch. Here we have 2 machines, as in this K8s cluster (there could be more).
  • We can group these machines however we like. For example, we can name the groups all, master and node.
[all]
master                 ansible_host=192.168.33.71
node                   ansible_host=192.168.33.72

[master]
master

[node]
node
  • Then we can run different operations on each group by assigning roles in our Ansible playbook.
- hosts: all
  become: yes
  roles:
    - common

- hosts: master
  become: yes
  roles:
    - common

- hosts: node
  become: yes
  roles:
    - common
  • We can run batch operations on the machine groups with the Ansible command in the example below.
PYTHONUNBUFFERED=1 ANSIBLE_FORCE_COLOR=true ANSIBLE_HOST_KEY_CHECKING=false ANSIBLE_SSH_ARGS='-o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o ControlMaster=auto -o ControlPersist=60s' ansible-playbook --connection=ssh --timeout=30 --user="vagrant" --limit="all" --inventory-file=./hosts --ask-pass --become -vvv ansible.yml
  • Ansible result: ansible
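To make the grouping above concrete, here is a minimal Python sketch that reads the same inventory text and resolves which machines a play with `hosts: <group>` would target. It is a simplified stand-in, not Ansible's own inventory parser: `parse_inventory` and `targets` are hypothetical helpers, and features such as `:children`, `:vars`, and host ranges are not handled.

```python
INVENTORY = """\
[all]
master                 ansible_host=192.168.33.71
node                   ansible_host=192.168.33.72

[master]
master

[node]
node
"""


def parse_inventory(text: str) -> dict:
    """Parse a simple INI-style inventory into {group: {host: {var: value}}}."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = groups.setdefault(line[1:-1], {})
            continue
        if current is None:
            continue  # Hosts listed before any group header are ignored in this sketch.
        host, *pairs = line.split()
        current[host] = dict(p.split("=", 1) for p in pairs if "=" in p)
    return groups


def targets(groups: dict, group: str) -> dict:
    """Map each host in `group` to its ansible_host, looked up across all groups."""
    result = {}
    for host in groups.get(group, {}):
        for members in groups.values():
            address = members.get(host, {}).get("ansible_host")
            if address:
                result[host] = address
                break
    return result


if __name__ == "__main__":
    parsed = parse_inventory(INVENTORY)
    for name in ("all", "master", "node"):
        print(name, targets(parsed, name))
```

The lookup across groups mirrors how the example inventory defines `ansible_host` once under `[all]` and then refers to the machines by name in `[master]` and `[node]`.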

K8s Cert Exp Service

service-arch

Certificate Management with kubeadm

kubeadm certs check-expiration

Determine the expiration date of a PEM-encoded SSL certificate with openssl

openssl x509 -enddate -noout -in /etc/kubernetes/pki/apiserver.crt
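The command above prints a single line of the form `notAfter=Mar 10 12:00:00 2025 GMT`. A minimal Python sketch of turning that line into a "days remaining" figure follows. The function names are hypothetical and this is not the code of the service in this repository; it also assumes an English locale for the month abbreviation.

```python
from datetime import datetime, timezone

PREFIX = "notAfter="


def parse_not_after(line: str) -> datetime:
    """Parse an openssl 'notAfter=...' line into a timezone-aware UTC datetime."""
    line = line.strip()
    if not line.startswith(PREFIX):
        raise ValueError(f"unexpected openssl output: {line!r}")
    # openssl prints e.g. "Mar 10 12:00:00 2025 GMT"; single-digit days are space-padded.
    stamp = datetime.strptime(line[len(PREFIX):], "%b %d %H:%M:%S %Y GMT")
    return stamp.replace(tzinfo=timezone.utc)


def days_left(line: str, now: datetime) -> int:
    """Whole days between `now` and the certificate's expiry (negative if expired)."""
    return (parse_not_after(line) - now).days


if __name__ == "__main__":
    sample = "notAfter=Mar 10 12:00:00 2025 GMT"  # made-up sample line
    print(days_left(sample, datetime(2025, 3, 1, 12, 0, 0, tzinfo=timezone.utc)))
```

Passing `now` explicitly keeps the function easy to test; in practice it would be `datetime.now(timezone.utc)`, with the input line taken from the output of the openssl command.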

K8s PKI Certificate Expiration Service

k8s-certs-exp

Tekton Dashboard

tekton

Better solutions

Best Practice Infrastructure Arch

best-practice-infra

Multi Cloud & Region Arch

multi-cloud-region

CI/CD Pipeline

ci-cd