netdata/helmchart

Incompatible with non-nginx ingress

ohthehugemanatee opened this issue · 3 comments

After a long discussion (and lots of help!) on Discord, I've identified an incompatibility with ingress controllers other than nginx.

In my case (running k3s with its default ingress, traefik), here's what happened:

  • netdata.k8s.local was resolving correctly to the cluster1 node, which also does ingress for me.
  • BUT traefik (unlike nginx) does not proxy arbitrary ports. It's strictly an HTTP/S proxy and only does ports 80/443.
  • So netdata.k8s.local:19999 bypassed the ingress and went to whatever had host-mapped port 19999 open on that node... the netdata child running on cluster1.

This was very confusing, especially since the UI difference is relatively subtle. Apart from the missing nodes in the left side panel, I could only identify that I was in the "child" pod from the page title.

Who this affects:

  • anyone running k3s... or really anyone who uses something other than nginx for ingress. Do you have that in a testbed yet?
  • anyone whose ingress is hosted on a schedulable node.

The fix I applied:

  • Set the kubernetes.io/ingress.class annotation to an empty string "", which falls back to the cluster's default ingress controller
  • Add an annotation to redirect http to https (traefik.ingress.kubernetes.io/redirect-entry-point: https) (ok, this is a nice-to-have)
  • Remove the port number from the URL I was using.
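
As a rough sketch, the first two steps above might look like this in a values override file. The key names (`ingress.enabled`, `ingress.annotations`, `ingress.hosts`) and the file name are assumptions about the chart's values.yaml layout, not something confirmed in this thread, so check them against the chart version you run. Both annotations are the ones named in the list above.

```yaml
# values-override.yaml (hypothetical file name)
# Key names are assumed; verify them against the chart's values.yaml.
ingress:
  enabled: true
  annotations:
    # Empty class: let the cluster's default ingress controller
    # (traefik on a default k3s install) handle the Ingress.
    kubernetes.io/ingress.class: ""
    # Nice-to-have: redirect HTTP to HTTPS (traefik-specific).
    traefik.ingress.kubernetes.io/redirect-entry-point: https
  hosts:
    - netdata.k8s.local
```

With something like this applied, the UI would be reached at netdata.k8s.local through ports 80/443, without the :19999 suffix.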

It would also help a lot if the parent listened on a port other than 19999.

Suggested changes for the project:

  • Empty the ingress class annotation by default. People using nginx don't need it explicitly specified, and for people without nginx it breaks the setup.
  • Have the ingress listen on port 80/443 by default. This will be compatible with more ingresses - probably all of them.

First of all, thanks @ohthehugemanatee for creating the issue, because what you mention here is actually quite important.

A warning here: this is not going to be a quick reply. I would also like to make a discussion out of it, meaning that I would like to hear your thoughts on what I write in the next… thousand lines or something :) Have your ☕? Let's go!

I have to mention that the Helm chart for Netdata is built so that as many users as possible can start their journey with our software with as little hassle as possible. You probably get the idea: we want you all to test it and quickly see whether it is the right product for you (SURE IT IS 😉). That does not mean, however, that we can give every user a perfect production setup with no configuration whatsoever.

I took my time to play with it, so let's start with the first one: the problem with the kubernetes.io/ingress.class annotation.
I really liked the idea of defaulting it to ""; it seemed like a perfect solution. Unfortunately, it is not a magic cure in and of itself. There are quite a few Ingress Controllers to choose from. The first part of this doc states that Kubernetes as a project supports and maintains only 3 of them: AWS, GCE and nginx. It so happens that a fair number of Kubernetes users (yes, not all of them) use the nginx controller. Choosing the last option from this short list also frees Netdata from "vendor lock" on a particular cloud provider. That is how nginx came to be the default. The value of "" is a bit of a dead end for a few reasons. They are not big, but there are differences between ingress controllers. Take the pathType parameter: nginx works with it set to Prefix, while GCE wants ImplementationSpecific there and refuses to accept Prefix.
You are clearly an experienced Kubernetes user, judging by what you have written, and so you also have your own preferred ingress controller. I would therefore say that some configuration is indeed needed to fit (any) new software into your solution of choice.
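
To make the pathType difference concrete, here is a minimal sketch of a plain Kubernetes Ingress manifest using the standard `networking.k8s.io/v1` schema. It is not the chart's actual template; the resource name and the backend service name are hypothetical, while the host and port are the ones mentioned in this thread.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: netdata-example            # hypothetical name
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
    - host: netdata.k8s.local
      http:
        paths:
          - path: /
            # Works with nginx; per the comment above, GCE expects
            # ImplementationSpecific here and rejects Prefix.
            pathType: Prefix
            backend:
              service:
                name: netdata      # hypothetical service name
                port:
                  number: 19999
```

A single default that suits every controller would have to get both the class and the pathType right at the same time, which is why an empty class alone does not settle it.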

The port problem is a bit easier from my point of view, or am I getting something wrong? The port parameter is at your disposal at all times, so you can change the port. The thing with the default value is that many of our users already have their setups built around port 19999. Changing our default now could do more harm than good, and I am honestly a bit worried about changing it.

Please do let me know what you think. I am still trying to find some middle ground here, but other than exposing more parameters in values.yaml to ease the pain of configuration, I have nothing. We also have to add the ingress class there for nginx (I will do it).

@ohthehugemanatee can you comment on what I wrote?

Closing after #244 was merged.