canonical/microk8s-core-addons

observability fails to enable on HA cluster

Closed this issue · 0 comments

Summary

I am trying to enable the new observability addon.

MicroK8s v1.25.0 revision 3883

microk8s status

microk8s is running
high-availability: yes
  datastore master nodes: 192.168.1.10:19001 192.168.1.11:19001 192.168.1.12:19001
  datastore standby nodes: none
addons:
  enabled:
    dashboard            # (core) The Kubernetes dashboard
    dns                  # (core) CoreDNS
    ha-cluster           # (core) Configure high availability on the current node
    helm                 # (core) Helm - the package manager for Kubernetes
    helm3                # (core) Helm 3 - the package manager for Kubernetes
    hostpath-storage     # (core) Storage class; allocates storage from host directory
    ingress              # (core) Ingress controller for external access
    metrics-server       # (core) K8s Metrics Server for API access to service metrics
    rbac                 # (core) Role-Based Access Control for authorisation
    storage              # (core) Alias to hostpath-storage add-on, deprecated
  disabled:
    cert-manager         # (core) Cloud native certificate management
    community            # (core) The community addons repository
    gpu                  # (core) Automatic enablement of Nvidia CUDA
    host-access          # (core) Allow Pods connecting to Host services smoothly
    kube-ovn             # (core) An advanced network fabric for Kubernetes
    mayastor             # (core) OpenEBS MayaStor
    metallb              # (core) Loadbalancer for your Kubernetes cluster
    observability        # (core) A lightweight observability stack for logs, traces and metrics
    prometheus           # (core) Prometheus operator for monitoring and logging
    registry             # (core) Private image registry exposed on localhost:32000

I have a 3-node cluster.

microk8s enable observability

Infer repository core for addon observability
Addon core/dns is already enabled
Addon core/helm3 is already enabled
Addon core/hostpath-storage is already enabled
Enabling observability
"prometheus-community" already exists with the same configuration, skipping
"grafana" already exists with the same configuration, skipping
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "openebs" chart repository
...Successfully got an update from the "prometheus-community" chart repository
...Successfully got an update from the "grafana" chart repository
Update Complete. ⎈Happy Helming!⎈
Release "kube-prom-stack" does not exist. Installing it now.
Error: Endpoints "kube-prom-stack-kube-prome-kube-scheduler" is invalid: [subsets[0].addresses[0].ip: Invalid value: "192.168.1.10,192.168.1.11,192.168.1.12": must be a valid IP address, (e.g. 10.9.8.7 or 2001:db8::ffff), subsets[0].addresses[0].ip: Invalid value: "192.168.1.10,192.168.1.11,192.168.1.12": must be a valid IP address]
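The error suggests that the three datastore node IPs were joined into one comma-separated string and passed as a single endpoint address. The sketch below only illustrates why such a value fails validation while each individual address passes. The helpers `is_ipv4`, `report` and `split_and_report` are hypothetical names made up for this illustration and are not part of MicroK8s or the Helm chart.

```shell
#!/bin/sh
# Value copied from the error message above.
joined="192.168.1.10,192.168.1.11,192.168.1.12"

# Hypothetical helper: succeeds only for a single dotted-quad IPv4 address.
# The body runs in a subshell so the IFS change does not leak to the caller.
is_ipv4() (
  case "$1" in
    ""|*[!0-9.]*) exit 1 ;;   # empty, or contains anything but digits and dots
    .*|*.|*..*) exit 1 ;;     # leading, trailing or doubled dots
  esac
  IFS=.
  set -- $1                   # split into octets on "."
  [ "$#" -eq 4 ] || exit 1
  for octet in "$@"; do
    [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || exit 1
  done
)

# Hypothetical helper: prints whether one value is a valid IPv4 address.
report() {
  if is_ipv4 "$1"; then echo "valid: $1"; else echo "invalid: $1"; fi
}

# Hypothetical helper: splits a comma-separated list and reports each entry.
split_and_report() (
  IFS=,
  for node in $1; do report "$node"; done
)

report "$joined"            # the joined string is rejected, as in the error
split_and_report "$joined"  # each address on its own is accepted
```

If this reading is right, the endpoint list would need one entry per node IP rather than a single joined string.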

What Should Happen Instead?

The addon should be enabled.

Reproduction Steps

  1. I was running MicroK8s 1.24 with the prometheus addon enabled.
  2. I then disabled the prometheus addon and upgraded the HA cluster to 1.25.
  3. I tried to enable the observability addon.

Introspection Report

Inspecting system
Inspecting Certificates
Inspecting services
  Service snap.microk8s.daemon-cluster-agent is running
  Service snap.microk8s.daemon-containerd is running
  Service snap.microk8s.daemon-kubelite is running
  Service snap.microk8s.daemon-k8s-dqlite is running
 FAIL:  Service snap.microk8s.daemon-apiserver-proxy is not running
For more details look at: sudo journalctl -u snap.microk8s.daemon-apiserver-proxy
  Service snap.microk8s.daemon-apiserver-kicker is running
  Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
  Copy processes list to the final report tarball
  Copy disk usage information to the final report tarball
  Copy memory usage information to the final report tarball
  Copy server uptime to the final report tarball
  Copy openSSL information to the final report tarball
  Copy snap list to the final report tarball
  Copy VM name (or none) to the final report tarball
  Copy current linux distribution to the final report tarball
  Copy network configuration to the final report tarball
Inspecting kubernetes cluster
  Inspect kubernetes cluster
Inspecting dqlite
  Inspect dqlite

sudo journalctl -u snap.microk8s.daemon-apiserver-proxy -f

-- Logs begin at Tue 2022-07-19 00:32:22 UTC. --
Sep 12 15:07:17 kubey microk8s.daemon-apiserver-proxy[459835]: + ARCH=x86_64
Sep 12 15:07:17 kubey microk8s.daemon-apiserver-proxy[459835]: + export LD_LIBRARY_PATH=/var/lib/snapd/lib/gl:/var/lib/snapd/lib/gl32:/var/lib/snapd/void::/snap/microk8s/3883/lib:/snap/microk8s/3883/usr/lib:/snap/microk8s/3883/lib/x86_64-linux-gnu:/snap/microk8s/3883/usr/lib/x86_64-linux-gnu
Sep 12 15:07:17 kubey microk8s.daemon-apiserver-proxy[459835]: + LD_LIBRARY_PATH=/var/lib/snapd/lib/gl:/var/lib/snapd/lib/gl32:/var/lib/snapd/void::/snap/microk8s/3883/lib:/snap/microk8s/3883/usr/lib:/snap/microk8s/3883/lib/x86_64-linux-gnu:/snap/microk8s/3883/usr/lib/x86_64-linux-gnu
Sep 12 15:07:17 kubey microk8s.daemon-apiserver-proxy[459835]: + source /snap/microk8s/3883/actions/common/utils.sh
Sep 12 15:07:17 kubey microk8s.daemon-apiserver-proxy[459835]: ++ [[ /snap/microk8s/3883/run-apiserver-proxy-with-args == \/\s\n\a\p\/\m\i\c\r\o\k\8\s\/\3\8\8\3\/\a\c\t\i\o\n\s\/\c\o\m\m\o\n\/\u\t\i\l\s\.\s\h ]]
Sep 12 15:07:17 kubey microk8s.daemon-apiserver-proxy[459835]: + '[' -e /var/snap/microk8s/3883/var/lock/clustered.lock ']'
Sep 12 15:07:17 kubey microk8s.daemon-apiserver-proxy[459835]: + echo 'Not a worker node, exiting'
Sep 12 15:07:17 kubey microk8s.daemon-apiserver-proxy[459835]: Not a worker node, exiting
Sep 12 15:07:17 kubey microk8s.daemon-apiserver-proxy[459835]: + exit 0
Sep 12 15:07:17 kubey systemd[1]: snap.microk8s.daemon-apiserver-proxy.service: Succeeded.
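The journal output suggests the FAIL reported for apiserver-proxy is probably unrelated to the addon error: the service script tests for `clustered.lock`, prints "Not a worker node, exiting" and exits with status 0. A minimal sketch of the same test follows, assuming the lock path shown in the trace and assuming that the presence of the file marks a worker node. `node_role` is a hypothetical helper, not a MicroK8s command.

```shell
#!/bin/sh
# Lock path copied from the trace above; revision 3883 is specific to this install.
LOCK_FILE="${LOCK_FILE:-/var/snap/microk8s/3883/var/lock/clustered.lock}"

# Hypothetical helper mirroring the `[ -e ... ]` test seen in the trace.
# Assumption: the lock file exists only on worker nodes.
node_role() {
  if [ -e "$1" ]; then
    echo "worker"
  else
    echo "not a worker"
  fi
}

node_role "$LOCK_FILE"
```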

To clean up the half-enabled addon, I have to uninstall the Helm release and delete the CRDs by hand. See the kube-prometheus-stack chart README:
https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack#configuration

microk8s.helm uninstall kube-prom-stack -n observability

kubectl delete crd alertmanagerconfigs.monitoring.coreos.com
kubectl delete crd alertmanagers.monitoring.coreos.com
kubectl delete crd podmonitors.monitoring.coreos.com
kubectl delete crd probes.monitoring.coreos.com
kubectl delete crd prometheuses.monitoring.coreos.com
kubectl delete crd prometheusrules.monitoring.coreos.com
kubectl delete crd servicemonitors.monitoring.coreos.com
kubectl delete crd thanosrulers.monitoring.coreos.com
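The CRD deletions above can be sketched as a loop. This is only a convenience wrapper around the same commands. The `KUBECTL` and `DRY_RUN` variables and the `cleanup_crds` function are names introduced for this sketch, and `--ignore-not-found` is an addition so that a rerun does not fail on CRDs that are already gone.

```shell
#!/bin/sh
# CRD name prefixes copied from the commands above.
crds="alertmanagerconfigs alertmanagers podmonitors probes prometheuses prometheusrules servicemonitors thanosrulers"

# Assumption: set KUBECTL="microk8s kubectl" if no standalone kubectl is installed.
KUBECTL="${KUBECTL:-kubectl}"
# DRY_RUN=1 (the default in this sketch) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

cleanup_crds() {
  for crd in $crds; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "$KUBECTL delete crd $crd.monitoring.coreos.com"
    else
      # $KUBECTL is left unquoted on purpose so "microk8s kubectl" splits into two words.
      $KUBECTL delete crd "$crd.monitoring.coreos.com" --ignore-not-found
    fi
  done
}

cleanup_crds
```

Running it with `DRY_RUN=0` performs the deletions. The default prints the eight commands so they can be reviewed first.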