nginxinc/kubernetes-ingress

Make setting hostPort on daemonset configurable in the helm chart.

jo-carter opened this issue · 12 comments

Problem

Multiple NIC deployments cannot run simultaneously as daemonsets with NodePorts using the current helm chart, because hostPort is hard-coded into the template pod spec for ports 80 and 443.

daemon-set-controller-template
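For reference, the relevant part of the template pins the ports along these lines (paraphrased, not a verbatim copy of the chart):

```yaml
# Pod spec excerpt (paraphrased): hostPort is emitted unconditionally,
# so two daemonsets scheduled on the same nodes both try to bind
# 80/443 on the host, and the second one stays Pending.
ports:
- name: http
  containerPort: 80
  hostPort: 80
- name: https
  containerPort: 443
  hostPort: 443
```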

Describe the solution you'd like
Make setting hostPort optional in the helm chart/helm values file.
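A rough sketch of the shape this could take - the value names below are hypothetical, purely to illustrate the idea:

```yaml
# values.yaml (hypothetical keys, for illustration only)
controller:
  hostPort:
    enable: false   # when false, omit hostPort from the pod spec entirely
    http: 80        # used only when enable: true
    https: 443
```

```yaml
# templates/controller-daemonset.yaml (sketch of the conditional)
ports:
- name: http
  containerPort: 80
{{- if .Values.controller.hostPort.enable }}
  hostPort: {{ .Values.controller.hostPort.http }}
{{- end }}
- name: https
  containerPort: 443
{{- if .Values.controller.hostPort.enable }}
  hostPort: {{ .Values.controller.hostPort.https }}
{{- end }}
```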

Hi @jo-carter thanks for reporting!

Be sure to check out the docs while you wait for a human to take a look at this 🙂

Cheers!

Hi @jo-carter thanks for this suggestion. I'll bring this to the attention of the team to see how we will proceed. Will update you when I can!

Hi @jo-carter can you provide us with more details on how you are deploying multiple Ingress Controllers simultaneously? Are you deploying multiple replicas on a single node, or are you deploying across multiple nodes?
Can you also provide details on your deployment environment as well as your helm values? This will help us better replicate the issue.

Thanks!

Hi @shaun-nx, we are deploying multiple ingress controllers, each with multiple replicas, across the same worker nodes (hence the hostPort clash).

The purpose of the multiple Ingress Controllers is segregation of workloads.

We are deploying using the Helm Ingress Operator on OpenShift 4.10, deploying NIC to different namespaces. That project sources its charts from this repo, hence I've filed the issue here.

Sure, let me know if you want the IO manifests, or I'm happy to provide one with NIC's chart alone if needed - the result will be the same: you cannot bind to the same hostPort twice.

@jo-carter any specific reason why you're using a daemonset? If you want to have multiple pods per node you should use a deployment, as daemonsets are for running one instance of the pod per node.

Hi @lucacome it's a single replica per node per ingress controller (for use with NodePorts) - it's only multiple pods per node because multiple ingress controller daemonsets are scheduled.

Deploying it as a daemonset is a documented option - from the docs.

This is for workload segregation: each ingress controller handles its own subset of the total applications.

This is also an issue for any other daemonset that wants to use hostPorts 443 or 80 - outside of my use case of deploying multiple ingress controllers.

fa3nom commented

Hi, I have the exact same need for the exact same reason. We are currently debating whether to patch the Daemonset after each helm upgrade via CI - but making the single non-configurable part of the original template customizable would be the much better solution.

Thank you.

One thing that we would like to better understand is why, in this situation, you are using a daemonset as opposed to a deployment - especially with multiple NGINX Ingress Controller 'deployments'.

(Not that we won't address the issue.)

Hi @brianehlert

I fail to see how using a daemonset is any more unusual with multiple ICs than using a deployment. Is there an advantage you see a deployment offering over a daemonset?

A daemonset is specifically designed for my use case: scheduling one pod (from the set) per node, which is required as we are using NodePorts and intend to use the local external traffic policy (externalTrafficPolicy: Local) for performance reasons.
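For context, a minimal sketch of the service shape this is built around (plain Kubernetes YAML; names and NodePort numbers are examples):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-ingress-team-a   # one such service per ingress controller
spec:
  type: NodePort
  externalTrafficPolicy: Local  # keep traffic on the node it arrived at;
                                # clean when exactly one pod runs per node
  selector:
    app: nginx-ingress-team-a
  ports:
  - name: http
    port: 80
    targetPort: 80
    nodePort: 30080
  - name: https
    port: 443
    targetPort: 443
    nodePort: 30443
```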

The alternative is to hack together a fake version of a daemonset with a deployment plus topology spread constraints and skew, which has its own problems (and is significantly more complicated than using a daemonset for its intended purpose).
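For anyone curious, the workaround looks roughly like this - a sketch only, and note the replica count still has to be kept in step with the node count by hand, which a real daemonset does for free:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-ingress-team-a
spec:
  replicas: 3   # must be sized to the number of eligible nodes yourself
  selector:
    matchLabels:
      app: nginx-ingress-team-a
  template:
    metadata:
      labels:
        app: nginx-ingress-team-a
    spec:
      # maxSkew: 1 + DoNotSchedule spreads pods one-per-node across
      # kubernetes.io/hostname, approximating daemonset placement
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: nginx-ingress-team-a
      containers:
      - name: nginx-ingress
        image: nginx/nginx-ingress:3.0.0   # illustrative tag
```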

fa3nom commented

There is nothing I can add. We also followed the approach of using a deployment, just because it seemed the obvious way judging by the chart design, but we gave up trying to replicate daemonset behaviour because there were always cases with unintended effects or a need for manual overrides.

We have a set of n nodes which are configured (network segment, sizing, firewall integration) to run ICs only, each on different ports. Deploying n additional nodes for each stack of ICs would be huge overhead, just as mimicking daemonset functionality is. It would even be nice to disable hostPort altogether, so the service with its externalIPs and the already-implemented options for custom ports could kick in properly.
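For illustration, the kind of values we'd like to be able to rely on once hostPort is optional - a sketch, assuming the chart's service keys work roughly the way we read them, so treat the exact names as unverified:

```yaml
controller:
  kind: daemonset
  hostPort:
    enable: false          # the hypothetical flag this issue asks for
  service:
    create: true
    externalIPs:
    - 192.0.2.10           # example address (TEST-NET-1 range)
    httpPort:
      enable: true
      port: 8080           # custom port instead of host port 80
    httpsPort:
      enable: true
      port: 8443
```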

Thank you.

Daemonset vs deployment is personal preference.
The difference is that a daemonset forces one pod onto every node, and for many customers this consumes more compute than is necessary.
There is no traffic-routing benefit to a daemonset.
Many customers also run a deployment for the ability to dynamically scale the ingress layer.

Where a daemonset absolutely matters is when you have applications that require a fully stateful connection from client to backend pod, as only a daemonset-style deployment can guarantee that each hop in the path maintains integrity.

We have customers that run a deployment for scaling flexibility but use anti-affinity to force a spread across nodes. Still valid, and it supports the dynamic scaling need.
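A rough sketch of that anti-affinity pattern in a pod template (plain Kubernetes YAML; the label is an example):

```yaml
# Hard anti-affinity: no two ingress pods with this label land on the
# same node, so scaling the deployment up spreads it across nodes.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: nginx-ingress-team-a
      topologyKey: kubernetes.io/hostname
```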

Now, I won't forget that this depends on dynamic disaggregation (aka a load balancer) in front of your cluster(s), and that is not always available or feasible.

Honestly, we have customers doing all kinds of things - because that is what suits their needs. And I always want to understand the reasons why, so that we can solve the problems in the best way for the most folks.

I would be happy for anyone that wants to, to pick this up and take a stab at it.

Thanks @brianehlert

Just a side note - you can indeed use taints and tolerations, and node selectors, with a daemonset just as you would with a deployment, to target scheduling on a subset of nodes.
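For example, a minimal sketch - the label and taint names here are made up:

```yaml
# Daemonset pod spec fragment pinned to a dedicated ingress node pool.
spec:
  nodeSelector:
    node-role.example.com/ingress: "true"   # hypothetical node label
  tolerations:
  - key: node-role.example.com/ingress      # hypothetical matching taint
    operator: Exists
    effect: NoSchedule
```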

There's a good write-up here.

You'll notice one is implicitly in force for regular daemonsets, ensuring they do not schedule on control plane (master) nodes.

I'd say daemonset should be the de facto standard for NIC - with NGINX there is little reason to schedule more than one pod per node (per IC), given its ability to scale vertically via the worker-process model.