sharing infrastructure between deployments
mikedanese opened this issue · 15 comments
There is a certain amount of tooling that will benefit all deployment automations. We should discuss how to develop these tools in a way that they benefit as many of the maintained deployments as possible. Some work that I can think of that would benefit all deployments:
- Config revamp. It's hard to configure kubernetes components.
- Finish componentconfig kubernetes/kubernetes#12245
- Move as much config as possible into the master kubernetes/kubernetes#1627
- Create a cluster config for cluster wide configuration parameters kubernetes/kubernetes#19831
- Kubelet dynamic config kubernetes/kubernetes#27980
- Own the installation. Get kubelet and its dependencies installed.
- Standardize on a node container image #34
- (or) build packagemanager packages (rpms and debs)
- commit to a single binary
- Make pod network easy to deploy. It's hard to set up the pod network.
- Allow CNI node agents to run in DaemonSets on the host and configure the kubelet
- Revamp the addon manager. There is no standard way to deploy addons.
- Move addons out of kubernetes core repo
- Get rid of the bash addon manager, possibly replace with helm
- Implement kubectl apply --prune kubernetes/kubernetes#19805
- Initial client bootstrap. It's hard to set up secure communication between k8s components.
- Discovery API kubernetes/kubernetes#5754
- TLS Bootstrap kubernetes/kubernetes#18112 (a rough sketch of this flow is included below)
- Self hosting. Make it easy to run kubernetes on kubernetes.
- Daemonset upgrades kubernetes/kubernetes#22543 (comment)
- Easier self hosted control plane bootstrapping
- Kubelet checkpointing kubernetes/kubernetes#489
- Kubeclient should follow apiservers and do failover
- Kubeconfig v2
Obviously, some of these have higher relative priority than others. Let's use this issue to track the deployment shared infrastructure effort. Let me know if any items are missing from this list.
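To make the initial client bootstrap item a bit more concrete, here is a rough Go sketch of the TLS bootstrap flow discussed in kubernetes/kubernetes#18112, expressed against the certificates API: the node generates a key, submits a CSR using a low-privilege bootstrap credential, and waits for approval. The bootstrap kubeconfig path, node name, and CSR name are illustrative assumptions, not part of any existing tool.

```go
// Rough sketch of the kubelet-side TLS bootstrap flow: generate a key,
// submit a CSR via the certificates API using a low-privilege bootstrap
// credential, and wait for approval. Paths and names are illustrative.
package main

import (
	"context"
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"log"

	certsv1 "k8s.io/api/certificates/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	nodeName := "node-1" // illustrative

	// Client built from a limited bootstrap credential (e.g. a token kubeconfig).
	cfg, err := clientcmd.BuildConfigFromFlags("", "/etc/kubernetes/bootstrap-kubeconfig")
	if err != nil {
		log.Fatal(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Generate the node's client key; a real kubelet would persist it to disk.
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		log.Fatal(err)
	}
	csrDER, err := x509.CreateCertificateRequest(rand.Reader, &x509.CertificateRequest{
		Subject: pkix.Name{
			CommonName:   "system:node:" + nodeName,
			Organization: []string{"system:nodes"},
		},
	}, key)
	if err != nil {
		log.Fatal(err)
	}
	csrPEM := pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE REQUEST", Bytes: csrDER})

	// Submit the CSR; once an approver signs it, the issued certificate
	// appears in the object's status and the node switches to it.
	csr := &certsv1.CertificateSigningRequest{
		ObjectMeta: metav1.ObjectMeta{Name: "node-csr-" + nodeName},
		Spec: certsv1.CertificateSigningRequestSpec{
			Request:    csrPEM,
			SignerName: certsv1.KubeAPIServerClientKubeletSignerName,
			Usages:     []certsv1.KeyUsage{certsv1.UsageClientAuth},
		},
	}
	if _, err := client.CertificatesV1().CertificateSigningRequests().Create(
		context.TODO(), csr, metav1.CreateOptions{}); err != nil {
		log.Fatal(err)
	}
	log.Printf("submitted CSR for %s; waiting for approval", nodeName)
}
```

Once the CSR is approved and signed, the node would pick up the issued certificate from the object's status and use it for all further communication with the apiserver.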
That list looks like a great place to start :-)
For the standardized node container image, did you mean to link to #40?
Nope, meant to link to #34, thanks.
@aaronlevy @philips are CoreOS folks interested in claiming any of these items for v1.4 (that you haven't already claimed)? Is there anything you want to add to this list?
I am not sure I follow the goal with #34. Are you proposing rkt as the foundational runtime to bootstrap a node?
Can you make it clear - is this the single repo? Or something else? Can we please find one repo, and kill all the rest?
@derekwaynecarr there is no dependency on rkt, although rkt can easily run that image. There's actually a minimal implementation of what rkt fly does here that prepares the chroot to be run with systemd and the RootDirectory directive. The idea is to package kubelet and docker and all the various dependencies to standardize the execution environment across deployments. Obviously it would be optional to use this as the basis of your deployment automation, but it could help. #34 is a prototype of how to build such an artifact, but I'm open to other mechanisms. Also, I haven't gotten it to work yet due to a docker bug.
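For illustration, the chroot preparation amounts to roughly the following. This is a minimal Go sketch under assumed paths (an image unpacked at /var/lib/node-image/rootfs), not what #34 actually does:

```go
// Minimal sketch of the rkt-fly-style chroot preparation described above.
// Assumes the node image has been unpacked at rootfsDir and that the bind
// targets already exist inside it; must run as root. Illustrative only.
package main

import (
	"log"
	"os"
	"os/exec"
	"path/filepath"
	"syscall"
)

func main() {
	rootfsDir := "/var/lib/node-image/rootfs" // assumed unpack location

	// Make the host's kernel filesystems and state visible inside the chroot.
	for _, src := range []string{"/proc", "/sys", "/dev", "/var/lib/docker", "/etc/resolv.conf"} {
		target := filepath.Join(rootfsDir, src)
		if err := syscall.Mount(src, target, "", syscall.MS_BIND|syscall.MS_REC, ""); err != nil {
			log.Fatalf("bind mount %s: %v", src, err)
		}
	}

	// Enter the chroot (roughly what RootDirectory= does for a systemd unit)
	// and run the kubelet packaged in the image.
	if err := syscall.Chroot(rootfsDir); err != nil {
		log.Fatalf("chroot: %v", err)
	}
	if err := os.Chdir("/"); err != nil {
		log.Fatal(err)
	}
	cmd := exec.Command("/usr/bin/kubelet", os.Args[1:]...)
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		log.Fatalf("kubelet exited: %v", err)
	}
}
```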
As far as "easier self-hosted control plane boostraping" -- what does this encompass in your view?
We have been working on https://github.com/coreos/bootkube as a means of demonstrating / running self-hosted control planes -- but I think a lot of this functionality could be removed (or simplified) given a writeable kubelet api (submit pods directly to kubelet which will be rectified with apiserver state/objects once available).
As far as checkpointing we would like to start helping with that work. In the self-hosted scenario this is important for a few reasons:
- Checkpointing configuration on a self-hosted kubelet means the "on-host" kubelet can track the configuration of the self-hosted copy so the two don't become skewed. For example, I deploy initial flag sets, but over time these change; I would like to checkpoint the new configuration locally so I can recover state without access to an api-server. The same is true of a (non-self-hosted) kubelet which updates in place or eventually uses an api object for its configuration: the initial configuration should be replaced by checkpointed configuration that can be used prior to contacting the apiserver.
- If all api-servers die simultaneously, there is no automated recovery. With checkpointing of pods (in this case an api-server), you can recover from local state even if you only have a single api-server.
However, I brought up in sig-node that CoreOS would like to start at least scoping out what this checkpointing work would look like -- and it didn't seem like there was much interest. Maybe @vishh can provide some thoughts?
As an alternative to doing this work upstream, we've been working on "user-space" checkpointing as a stopgap (making use of side-car containers / dumping flags to disk / podmaster-like functionality).
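To give a flavor of that stopgap, here is a hedged Go sketch of a "user-space" pod checkpointer: a sidecar that copies opted-in kube-system pods into the kubelet's static-pod manifest directory so the node can restart them from local state if every api-server is down. The opt-in label, manifest path, and interval are assumptions, not how bootkube actually implements it.

```go
// Sketch of a "user-space" pod checkpointer: periodically copy opted-in pods
// into the kubelet's static-pod manifest directory so they survive the loss
// of all api-servers. The opt-in label and manifest path are assumptions.
package main

import (
	"context"
	"log"
	"os"
	"path/filepath"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"sigs.k8s.io/yaml"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	manifestDir := "/etc/kubernetes/manifests" // assumed kubelet static-pod path

	for {
		pods, err := client.CoreV1().Pods("kube-system").List(context.TODO(),
			metav1.ListOptions{LabelSelector: "checkpoint=true"}) // hypothetical opt-in label
		if err != nil {
			// Api-server unreachable: keep whatever was checkpointed last.
			log.Printf("list pods: %v (keeping existing checkpoints)", err)
		} else {
			for _, p := range pods.Items {
				// A real checkpointer would also strip status, owner refs, etc.
				p.TypeMeta = metav1.TypeMeta{APIVersion: "v1", Kind: "Pod"}
				data, err := yaml.Marshal(p)
				if err != nil {
					log.Printf("marshal %s: %v", p.Name, err)
					continue
				}
				path := filepath.Join(manifestDir, p.Name+".yaml")
				if err := os.WriteFile(path, data, 0o644); err != nil {
					log.Printf("write %s: %v", path, err)
				}
			}
		}
		time.Sleep(30 * time.Second)
	}
}
```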
One other thing that is still on the radar is multi-apiserver support in kubelet + kube-proxy. Lower priority if we can get checkpointing in place, but something still to consider.
cc @kubernetes/sig-cluster-lifecycle
@aronchick We're moving towards lots of repos. That's github's reality. However, we are trying to merge redundant efforts, which I assume was your main concern.
Some of this is in:
https://github.com/kubernetes/community/wiki/Roadmap:-Cluster-Deployment
@bgrant0607 sorry, yes - you are correct. The latter was my issue - as per the community meeting today, I think we all agree we should just choose one getting started center point for work, and focus our efforts there. I still don't understand why there's both kube-deploy and kube-anywhere, weren't those efforts already merged?
@aronchick min-turnup and kubernetes-anywhere were merged and min-turnup was removed from kube-deploy. kube-deploy holds other projects. It's the contrib for deployment automation.
@aaronlevy Take a look at kubernetes/kubernetes#27980 . That should hopefully solve the config management issues wrt. self-hosting.
@derekwaynecarr @mikedanese Just to clarify, rkt will be a transient dependency for bootstrapping just the kubelet, right? #34 (comment)
@mikedanese - many of the items in your initial list are now under development and I'm not sure this issue is useful any longer. Can we close as obsolete?