kubernetes/kubeadm

Improvements to the handling of cluster addons

neolit123 opened this issue · 27 comments

this is a tracking issue for addon improvements.
it has a proposal that possibly needs discussion.

a couple of problems:

  1. addons rely on the same image repository as the control plane images.
     addons don't follow the k8s versioning and users have no control over which version of an addon is pulled.

  2. addon manifests are hardcoded in kubeadm and users need to patch the cluster if they need to apply different specs - (anti-)affinity rules, nodeSelectors, etc.

these complexities can be solved if addons are introduced in a separate structure in the kubeadm configuration.

rough pseudo example:

ClusterConfiguration:
...
  addons:
  - name: coredns
    image: <some-image-url>
    manifestFile: <some-path-to-manifest>
  - name: kube-proxy
    image: <some-image-url>
    manifestFile: <some-path-to-manifest> # inlining
  - name: <some-other-addon?>
    image: <some-image-url>
    manifestFile: <some-path-to-manifest> # inlining

notes:

  • core addons can have special handling if needed, possibly reserving some names for such.
  • manifest files can contain --- separated list of objects. Deployment, ConfigMap etc.
  • if manifestFile is not defined, use the default (hardcoded) values.
  • if image is not defined, use the defaults.

/cc @chuckha @fabriziopandini

edit1: it was also proposed to have restore-default-at-upgrade as a possible knob per addon element.
#1091 (comment)

rosti commented

I like the idea. A couple of notes though:

  • image should probably be images, an array of images. This covers the kube-dns case where 3 images are needed.
  • Stuff stored in the manifest files should be templated. A standard structure for addon templates should be created (probably containing stuff like the images, addon name, ClusterConfiguration, node stuff). This way we can standardize most of the addon specific code and reduce (or even remove) that code.
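The templating idea above could look roughly like this, using Go's text/template. The AddonTemplateData structure, its field names, and the template itself are assumptions for illustration, not an existing kubeadm type:

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// AddonTemplateData is a hypothetical standard structure for addon templates,
// as suggested above (images, addon name, and similar cluster/node data).
type AddonTemplateData struct {
	Name   string
	Images []string // an array of images, covering e.g. the kube-dns 3-image case
}

// deploymentTmpl is an illustrative manifest template.
const deploymentTmpl = `apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Name }}
spec:
  template:
    spec:
      containers:
{{- range .Images }}
      - image: {{ . }}
{{- end }}
`

// renderAddon fills the template with the standard addon data.
func renderAddon(data AddonTemplateData) (string, error) {
	t, err := template.New("addon").Parse(deploymentTmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := renderAddon(AddonTemplateData{
		Name:   "kube-dns",
		Images: []string{"kubedns:x", "dnsmasq:y", "sidecar:z"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

This is the kind of standardization that could replace most of the per-addon manifest code.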

Should the images/repository field go in the manifest file instead of outside the custom configurations? If not, what makes the images/repository more special than the configuration of the plugin?

rosti commented

If we use an array of images, we can handle the case with non-obvious dependencies (like in the kube-dns case) and still have a fully functional image pre-pulling. However, if we can use only a single image there, then I agree with you, that this is pointless.

currently the manifests include types like DaemonSet, Service, Deployment, which are API types.
if we want the image list to be defined in a file, then we need to define a custom structure.

@neolit123 thanks for kicking the ball on this topic
Some personal thoughts about this thread

  • kubeadm should manage only two addons: a proxy and a dns (currently in two variants). For all the other addons there are already plenty of alternatives: kubectl apply, helm, kustomize, etc.
  • kubeadm should support add-on customizations up to certain limits (e.g. images, cluster-wide component config + eventually node-specific extra-args). Those limits also define the perimeter of the kubeadm test grids.
  • For use cases beyond the above limits (e.g. manifest customizations) it should be possible to skip the addon phase and "do it yourself"

kubeadm should manage only two addons: a proxy and a dns (currently in two variants). For all the other addons there are already plenty of alternatives: kubectl apply, helm, kustomize, etc.

kubeadm should support add-on customizations up to certain limits (e.g. images, cluster-wide component config + eventually node-specific extra-args). Those limits also define the perimeter of the kubeadm test grids.

ok, these two are some good points.

For use cases beyond the above limits (e.g. manifest customizations) it should be possible to skip the addon phase and "do it yourself"

this also makes perfect sense, except that all the reports i've seen about this feature were about using custom changes in addon manifests with kubeadm init. phases are powerful, but it seems that the users want this from kubeadm init, because it's the simpler UX.

it seems that the users want this from kubeadm init, because it's the simpler UX.

That's true and I'm fine in pushing the limit of kubeadm supported configurations as far as we can follow with a good test grid coverage and/or we can reasonably ensure stable and predictable workflows.

@neolit123 @fabriziopandini I think we should support customization of the add-on version to let users pick the version they want. This is useful and simple. A newer version can include many fixes...

A caveat with version selection: CoreDNS for example typically has a new default configuration with every release. This configuration is not built in the binary - it's essentially hard coded in kubeadm. Therefore (theoretical example) if installing CoreDNS 1.2.2 with kubeadm 1.11, you'd get the CoreDNS 1.1.3 default k8s deployment configuration. So this would be a "use at your own risk" option.

i see the common use case being:

addons:
  - name: coredns
  - name: kube-proxy

this would install the two addons, based on hardcoded configuration.
which we probably need to expose on the command line AKA config --print-default [1]

changing the kube-proxy image might not make a lot of sense because it's bound to k8s versioning, but this still allows manifestFile for tweaks.

@chrisohaver

A caveat with version selection: CoreDNS for example typically has a new default configuration with every release. This configuration is not built in the binary - it's essentially hard coded in kubeadm. Therefore (theoretical example) if installing CoreDNS 1.2.2 with kubeadm 1.11, you'd get the CoreDNS 1.1.3 default k8s deployment configuration. So this would be a "use at your own risk" option.

we have the option to make it so that if the user sets a different image for an addon a custom manifest file will be mandatory. but yes, they would be on their own if they want to use a custom image/config and we probably need to expose the default one as outlined in [1].

rosti commented

We should also consider that allowing the user to have a custom config map (for example) might make upgrades impossible, to the point of breaking them.

The user may be required to do a manual upgrade of the addons if they used some config map option that was changed and could not be upgraded automatically by kubeadm.

On the other hand, allowing the use of the latest patch release might be desirable, since those are not expected to change configs or APIs, but fix bugs.

i can definitely see the complication with upgrades, but again the users are on their own if they modify something beyond recognition. this is the case in most software upgrade scenarios.

we can potentially diff out deviations from the default addon manifests, print them out, but leave it to the users to manually patch.
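The "diff out deviations" idea above could be sketched like this (a naive line-set comparison for illustration only; a real implementation would want a proper diff library):

```go
package main

import (
	"fmt"
	"strings"
)

// diffLines naively compares the default addon manifest shipped with kubeadm
// against the manifest currently in the cluster. It only reports lines that
// are present in one version but not the other, which is enough to show the
// user "you have deviated from the default like so".
func diffLines(defaultManifest, liveManifest string) []string {
	defSet := map[string]bool{}
	for _, l := range strings.Split(defaultManifest, "\n") {
		defSet[l] = true
	}
	liveSet := map[string]bool{}
	var out []string
	for _, l := range strings.Split(liveManifest, "\n") {
		liveSet[l] = true
		if !defSet[l] {
			out = append(out, "+ "+l) // user deviation
		}
	}
	for _, l := range strings.Split(defaultManifest, "\n") {
		if !liveSet[l] {
			out = append(out, "- "+l) // default line removed or changed
		}
	}
	return out
}

func main() {
	def := "replicas: 2\nimage: coredns:1.2.2"
	live := "replicas: 5\nimage: coredns:1.2.2"
	for _, d := range diffLines(def, live) {
		fmt.Println(d)
	}
}
```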

On the other hand, allowing the use of the latest patch release might be desirable, since those are not expected to change configs or APIs, but fix bugs.

Yes, the CoreDNS policy is that 3rd dot releases won't introduce any config incompatibilities (e.g. 1.0.0 -> 1.0.1). A 2nd dot release (e.g. 1.0.1 -> 1.1.0) means there is the possibility of config incompatibilities - although the default config may not be affected.
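That policy could be encoded as a simple check (a sketch only; it assumes plain "major.minor.patch" version strings with no pre-release tags):

```go
package main

import (
	"fmt"
	"strings"
)

// configCompatible encodes the CoreDNS policy described above: upgrades that
// change only the patch version (3rd dot) should not introduce config
// incompatibilities; a minor version bump (2nd dot) might.
func configCompatible(from, to string) bool {
	f := strings.SplitN(from, ".", 3)
	t := strings.SplitN(to, ".", 3)
	// compatible when major and minor components match
	return len(f) == 3 && len(t) == 3 && f[0] == t[0] && f[1] == t[1]
}

func main() {
	fmt.Println(configCompatible("1.0.0", "1.0.1")) // patch bump: compatible
	fmt.Println(configCompatible("1.0.1", "1.1.0")) // minor bump: possibly not
}
```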

Perhaps off topic: As it is now, how are we handling the coredns corefile/configmap during upgrades? Do we try to detect if the corefile is at a prior default, and if so update it to the new default? Or do we always replace (clobbering customizations)? Or do we never replace? I recall discussing ways to solve this, such as putting a comment in the corefile, or a label in the configmap (e.g. restore-default-at-upgrade=true) - leaving it to the user to control. I suppose we could also test the existing corefile to see if it's valid on the new version, and abort/warn the user if it is not.

@chrisohaver

Perhaps off topic: As it is now, how are we handling coredns corefile/configmap during upgrades?

as in my previous comment:

we can potentially diff out deviations from the default addon manifests, print them out, but leave it to the users to manually patch.

my initial idea for this would be to tell the user something along the lines of "hey, we see that you have deviated your addon configuration from the default one like so..." and pretty much diff it.

on upgrades we show this message: "kubeadm is now upgrading addon X to the default configuration for this new version and you need to perform manual actions to patch your manifest".

automatic handling might be out of scope / hard to do.

@chrisohaver @neolit123

Perhaps off topic: As it is now, how are we handling coredns corefile/configmap during upgrades?

Currently, the corefile/configmap remains untouched during an upgrade if an existing coredns configmap is detected.

@rajansandeep thanks, that's what I was asking.

So, if left unaddressed, we will eventually have issues of config incompatibilities during normal upgrades, even for users who never change their config from the default.

@chrisohaver @rajansandeep
sorry, i've missed the context of the off-topic.

the current handling is not optimal (AKA missing).

I recall discussing ways to solve, such as putting a comment in the corefile, or label in configmap (e.g. restore-default-at-upgrade=true) - leaving it to the user to control

i may have missed these discussions.
this should be per addon configuration and not only for coredns, it seems.

proposal:

  • by default always hard-replace the old config with the new default?
    restore-default-at-upgrade=false is a possible knob here.
  • look for deviations in the old config, warn about them on upgrade.
  • require manual actions from users.

ideas are welcome.
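The proposed default behavior and knob could be sketched like this. The restore-default-at-upgrade knob and the deviation check are assumptions from this thread, not existing kubeadm behavior:

```go
package main

import "fmt"

// upgradeAction sketches the decision logic for the proposal above:
// hard-replace by default, unless the user opted out via the hypothetical
// restore-default-at-upgrade=false knob, in which case deviations require
// manual action.
func upgradeAction(restoreDefault, hasDeviations bool) string {
	if restoreDefault {
		return "hard-replace old config with the new default"
	}
	if hasDeviations {
		return "keep config; warn and require manual patching"
	}
	return "no deviations found; apply the new default"
}

func main() {
	// user opted out of restore and has customized the config
	fmt.Println(upgradeAction(false, true))
}
```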

I suppose we could also test the existing corefile to see if it's valid on the new version, and abort/warn user if it is not.

would this validation process, require vendoring coredns in kubeadm?

@chrisohaver @rajansandeep
For kubelet and kube-proxy, kubeadm relies on component config for having a clean migration path for configuration files. This assumes that the owner of the config file, e.g. the kubelet team, provides conversion functions from config vX to config vY.
Are you aware of any config conversion functions provided by the CoreDNS team?

Are you aware of any config conversion functions (config vX to config vY) provided by the coreDNS team?

This doesn't exist yet. The complexity/flexibility of the CoreDNS config makes this challenging to do 100% cleanly.

We had some discussions about this a few months ago, and recently started talking about it again.

@chrisohaver ack. If there is an issue to follow such discussions, I will add some comments about how to make this usable from kubeadm.

in the meantime, if the proposal for addon customization is integrated in kubeadm before coredns gets the conversion and validation code in place, we might have to go with the approach of instructing the users to do manual patching. this is extra handling, but it might be a good idea to diff the user deviations for addons regardless.

i would like to remind that the demand for this feature seems high and if anyone notices a kubeadm issue along the lines of allow kubeadm to set [custom image | manifest feature X] for addon Y, just close -> redirect here.

In the office hours yesterday we discussed the need to figure out the balance here with bundles: https://github.com/GoogleCloudPlatform/k8s-cluster-bundle

I don't want to overfit the config, only to change it at a later date.

sorry for not attending - meetings and travel this week.

what we know about bundles at the moment is only the theory.
we can only have an estimate that they suit our needs once we test them.

i'm considering doing that, but the project doesn't have documentation and examples:
https://github.com/GoogleCloudPlatform/k8s-cluster-bundle
or at least i cannot find them.

@roberthbailey @justinsb do you have any examples for usage of the c-bundle or perhaps i should log an issue about that?

we already have coredns and kubeproxy addons in the config.
it was decided that we should not expose the manifest directly as the OP suggests.
this renders this tracking issue misleading and i'm closing it.

some related topics:

  • we still need to investigate cluster bundles
  • we have plans for a feature for modifying running clusters
  • impact on upgrades

as a side note kubectl now has kustomize built in.

rdxmb commented

May I ask: What is the recommended way to patch those addons like

  • Add anti-affinity to coredns to run it on different nodes by default?

Are there any docs for doing that? Thanks and kind regards.

EDITED: fix typo above

rdxmb commented

Ok, thanks to @neolit123 .

I will try to get the manifest with kubectl get and modify/deploy it again with kustomize.