observability fails to enable on HA cluster
monotok commented
Summary
I am trying to enable the new observability addon.
MicroK8s v1.25.0 revision 3883
microk8s status
microk8s is running
high-availability: yes
datastore master nodes: 192.168.1.10:19001 192.168.1.11:19001 192.168.1.12:19001
datastore standby nodes: none
addons:
enabled:
dashboard # (core) The Kubernetes dashboard
dns # (core) CoreDNS
ha-cluster # (core) Configure high availability on the current node
helm # (core) Helm - the package manager for Kubernetes
helm3 # (core) Helm 3 - the package manager for Kubernetes
hostpath-storage # (core) Storage class; allocates storage from host directory
ingress # (core) Ingress controller for external access
metrics-server # (core) K8s Metrics Server for API access to service metrics
rbac # (core) Role-Based Access Control for authorisation
storage # (core) Alias to hostpath-storage add-on, deprecated
disabled:
cert-manager # (core) Cloud native certificate management
community # (core) The community addons repository
gpu # (core) Automatic enablement of Nvidia CUDA
host-access # (core) Allow Pods connecting to Host services smoothly
kube-ovn # (core) An advanced network fabric for Kubernetes
mayastor # (core) OpenEBS MayaStor
metallb # (core) Loadbalancer for your Kubernetes cluster
observability # (core) A lightweight observability stack for logs, traces and metrics
prometheus # (core) Prometheus operator for monitoring and logging
registry # (core) Private image registry exposed on localhost:32000
I have a 3-node cluster.
microk8s enable observability
Infer repository core for addon observability
Addon core/dns is already enabled
Addon core/helm3 is already enabled
Addon core/hostpath-storage is already enabled
Enabling observability
"prometheus-community" already exists with the same configuration, skipping
"grafana" already exists with the same configuration, skipping
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "openebs" chart repository
...Successfully got an update from the "prometheus-community" chart repository
...Successfully got an update from the "grafana" chart repository
Update Complete. ⎈Happy Helming!⎈
Release "kube-prom-stack" does not exist. Installing it now.
Error: Endpoints "kube-prom-stack-kube-prome-kube-scheduler" is invalid: [subsets[0].addresses[0].ip: Invalid value: "192.168.1.10,192.168.1.11,192.168.1.12": must be a valid IP address, (e.g. 10.9.8.7 or 2001:db8::ffff), subsets[0].addresses[0].ip: Invalid value: "192.168.1.10,192.168.1.11,192.168.1.12": must be a valid IP address]
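The error shows the whole comma-separated list of control-plane IPs landing in a single Endpoints `ip` field, which can only hold one address. A minimal sketch of splitting that string into one entry per address (hypothetical illustration; the real fix belongs in the addon's enable script):

```shell
# Hypothetical: the enable script appears to pass this whole string as one
# Endpoints subset address; each IP needs its own "- ip:" entry instead.
IPS="192.168.1.10,192.168.1.11,192.168.1.12"
for ip in ${IPS//,/ }; do
  echo "  - ip: ${ip}"
done
```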
What Should Happen Instead?
The addon should be enabled.
Reproduction Steps
- I was running MicroK8s 1.24 and enabled the prometheus addon.
- I then disabled the addon and upgraded the HA cluster to 1.25.
- I tried to enable the observability addon.
Introspection Report
Inspecting system
Inspecting Certificates
Inspecting services
Service snap.microk8s.daemon-cluster-agent is running
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-kubelite is running
Service snap.microk8s.daemon-k8s-dqlite is running
FAIL: Service snap.microk8s.daemon-apiserver-proxy is not running
For more details look at: sudo journalctl -u snap.microk8s.daemon-apiserver-proxy
Service snap.microk8s.daemon-apiserver-kicker is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
Copy processes list to the final report tarball
Copy disk usage information to the final report tarball
Copy memory usage information to the final report tarball
Copy server uptime to the final report tarball
Copy openSSL information to the final report tarball
Copy snap list to the final report tarball
Copy VM name (or none) to the final report tarball
Copy current linux distribution to the final report tarball
Copy network configuration to the final report tarball
Inspecting kubernetes cluster
Inspect kubernetes cluster
Inspecting dqlite
Inspect dqlite
sudo journalctl -u snap.microk8s.daemon-apiserver-proxy -f
-- Logs begin at Tue 2022-07-19 00:32:22 UTC. --
Sep 12 15:07:17 kubey microk8s.daemon-apiserver-proxy[459835]: + ARCH=x86_64
Sep 12 15:07:17 kubey microk8s.daemon-apiserver-proxy[459835]: + export LD_LIBRARY_PATH=/var/lib/snapd/lib/gl:/var/lib/snapd/lib/gl32:/var/lib/snapd/void::/snap/microk8s/3883/lib:/snap/microk8s/3883/usr/lib:/snap/microk8s/3883/lib/x86_64-linux-gnu:/snap/microk8s/3883/usr/lib/x86_64-linux-gnu
Sep 12 15:07:17 kubey microk8s.daemon-apiserver-proxy[459835]: + LD_LIBRARY_PATH=/var/lib/snapd/lib/gl:/var/lib/snapd/lib/gl32:/var/lib/snapd/void::/snap/microk8s/3883/lib:/snap/microk8s/3883/usr/lib:/snap/microk8s/3883/lib/x86_64-linux-gnu:/snap/microk8s/3883/usr/lib/x86_64-linux-gnu
Sep 12 15:07:17 kubey microk8s.daemon-apiserver-proxy[459835]: + source /snap/microk8s/3883/actions/common/utils.sh
Sep 12 15:07:17 kubey microk8s.daemon-apiserver-proxy[459835]: ++ [[ /snap/microk8s/3883/run-apiserver-proxy-with-args == \/\s\n\a\p\/\m\i\c\r\o\k\8\s\/\3\8\8\3\/\a\c\t\i\o\n\s\/\c\o\m\m\o\n\/\u\t\i\l\s\.\s\h ]]
Sep 12 15:07:17 kubey microk8s.daemon-apiserver-proxy[459835]: + '[' -e /var/snap/microk8s/3883/var/lock/clustered.lock ']'
Sep 12 15:07:17 kubey microk8s.daemon-apiserver-proxy[459835]: + echo 'Not a worker node, exiting'
Sep 12 15:07:17 kubey microk8s.daemon-apiserver-proxy[459835]: Not a worker node, exiting
Sep 12 15:07:17 kubey microk8s.daemon-apiserver-proxy[459835]: + exit 0
Sep 12 15:07:17 kubey systemd[1]: snap.microk8s.daemon-apiserver-proxy.service: Succeeded.
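Going by the log above, the apiserver-proxy exits by design on non-worker nodes: it checks for a worker lock file and exits 0 when it is absent, so the inspect report's "FAIL" for this service looks expected on a control-plane node and unrelated to the Endpoints error. A paraphrase of that check (lock path copied from the log; not an official health check):

```shell
# Paraphrase of the check visible in the daemon-apiserver-proxy log:
# the proxy only runs on worker nodes, marked by clustered.lock.
LOCK=/var/snap/microk8s/3883/var/lock/clustered.lock
if [ -e "$LOCK" ]; then
  echo "worker node: apiserver-proxy should be running"
else
  echo "Not a worker node, exiting"
fi
```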
To clean up the half-enabled addon I have to run the following (CRD list taken from the kube-prometheus-stack chart documentation):
https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack#configuration
microk8s.helm uninstall kube-prom-stack -n observability
kubectl delete crd alertmanagerconfigs.monitoring.coreos.com
kubectl delete crd alertmanagers.monitoring.coreos.com
kubectl delete crd podmonitors.monitoring.coreos.com
kubectl delete crd probes.monitoring.coreos.com
kubectl delete crd prometheuses.monitoring.coreos.com
kubectl delete crd prometheusrules.monitoring.coreos.com
kubectl delete crd servicemonitors.monitoring.coreos.com
kubectl delete crd thanosrulers.monitoring.coreos.com
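The eight CRD deletions above can also be generated in a loop (a sketch assuming exactly the CRD names listed; it prints the commands rather than running them, so you can review before piping to `sh`):

```shell
# Emit the cleanup commands for the kube-prometheus-stack CRDs
# in the monitoring.coreos.com API group.
for crd in alertmanagerconfigs alertmanagers podmonitors probes \
           prometheuses prometheusrules servicemonitors thanosrulers; do
  echo kubectl delete crd "${crd}.monitoring.coreos.com"
done
```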