bitcoin-dev-project/warnet

bug: deploy-logging fails on Digital Ocean


Steps to reproduce

  1. Set up DO cluster
  2. Install helm (brew install helm, in my case)
  3. Connect kubectl by loading the cluster config
  4. Run warcli cluster deploy
  5. Run warcli cluster deploy-logging

Helm version: version.BuildInfo{Version:"v3.15.3", GitCommit:"3bb50bbbdd9c946ba9989fbe4fb4104766302a64", GitTreeState:"clean", GoVersion:"go1.22.5"}

Logging will fail on the Prometheus step with:

Release "prometheus" does not exist. Installing it now.
Error: failed pre-install: warning: Hook pre-install kube-prometheus-stack/templates/prometheus-operator/admission-webhooks/job-patch/clusterrolebinding.yaml failed: 1 error occurred:
        * Post "https://37e2bef1-1500-4185-b71c-4a159794e63f.k8s.ondigitalocean.com/apis/rbac.authorization.k8s.io/v1/clusterrolebindings?fieldManager=helm": EOF

Initial thoughts

This shouldn't really be cluster-specific, since it's all using helm underneath. My first thought is that the cluster I set up perhaps doesn't have the right permissions (hinted at by the fact that it's a role binding error). This could also be an API mismatch, where the chart we are trying to use relies on new or deprecated API features.
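One way to test the permissions theory is to ask the API server directly whether the current kubeconfig user may create cluster role bindings. The sketch below wraps `kubectl auth can-i`; the helper name `can_i` and the injectable `run` parameter are my own illustration, not part of warnet, and it assumes `kubectl` is on the PATH and already pointed at the DO cluster.

```python
import subprocess


def can_i(verb, resource, run=subprocess.run):
    """Return True if `kubectl auth can-i <verb> <resource>` answers yes.

    `run` is injectable (hypothetical parameter) so the check can be
    exercised without a live cluster.
    """
    cmd = ["kubectl", "auth", "can-i", verb, resource]
    result = run(cmd, capture_output=True, text=True)
    # kubectl exits 0 and prints "yes" when the action is allowed.
    return result.returncode == 0 and result.stdout.strip() == "yes"
```

Calling `can_i("create", "clusterrolebindings")` against the cluster should return `True` if permissions are fine, which would point away from RBAC and toward the connection dropping (the `EOF` in the error).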

Seems this might be timeout-related, per @m3dwards. I also noticed that logging takes forever to tear down because the Loki containers get stuck in a terminating status.

I got the timeouts too, but after repeated tries I was able to install. Perhaps log a support ticket with DO?
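Until the root cause is known, the "repeated tries" workaround could be automated with a small retry loop. This is only a sketch: `retry` and `deploy_logging` are hypothetical helpers, the attempt count and delays are arbitrary, and it assumes that re-running `warcli cluster deploy-logging` after a failed attempt is safe, as it seemed to be in the manual retries described above.

```python
import subprocess
import time


def retry(action, attempts=5, base_delay=2.0, sleep=time.sleep):
    """Call `action` until it succeeds, backing off exponentially.

    Only subprocess.CalledProcessError is treated as retryable; the
    last failure is re-raised once `attempts` is exhausted.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except subprocess.CalledProcessError:
            if attempt == attempts:
                raise
            # Wait 2s, 4s, 8s, ... between attempts (with the defaults).
            sleep(base_delay * 2 ** (attempt - 1))


def deploy_logging():
    # Same command as step 5 in the reproduction steps.
    subprocess.run(["warcli", "cluster", "deploy-logging"], check=True)
```

Running `retry(deploy_logging)` would then re-issue the deploy until it goes through or five attempts have failed. If it still fails consistently, that is useful detail to include in a DO support ticket.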

Will follow up on this after Tabconf. If we are planning to use GKE for Tabconf, then this shouldn't be a problem for now.