kelseyhightower/kubestack

Provision master and nodes across different zones

Closed this issue · 7 comments

It would be good to provision the master and nodes across different zones,
as using one zone for the whole cluster is not a good idea.

For testing purposes it is fine, but people will start running production clusters :)

@rimusz What do you think about the idea of multiple clusters, one in each zone? Depending on the network boundaries between zones, it may offer a better experience to have multiple clusters vs. one big one.

@kelseyhightower I'm not sure that is a good idea.
From my experience, when something bad is going on in a zone, your whole cluster
stops working.

Also, then you have no HA for your etcd cluster and your nodes,
especially if you want to run more than one k8s pod for some website.

Plus it gets more expensive when you have multiple sets of 3x etcd / k8s master servers
and so on.

So you end up with Ubernetes federation even for small setups, e.g. 20-node clusters.

I never ever trust a one-zone setup; bad experience in the past.

@kelseyhightower We might have zones listed in variables.tf e.g.:

variable "zone1" {
    default = "us-central1-a"
}
variable "zone2" {
    default = "us-central1-b"
}

or terraform.tfvars (I'm new to Terraform)

and then have zone1.tf etc. files with the worker_count or something like that, along the lines of the sketch below.
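
Something like this maybe (just a sketch, I haven't tested it; the zones / worker_count / image variables and the kube_worker resource are made-up names, not the ones kubestack actually uses):

variable "zones" {
    default = ["us-central1-a", "us-central1-b", "us-central1-c"]
}
variable "worker_count" {
    default = 3
}
variable "image" {
    # the CoreOS image name kubestack already uses for its instances
}

# Round-robin the workers across the zone list via count.index / element()
resource "google_compute_instance" "kube_worker" {
    count        = "${var.worker_count}"
    name         = "kube-worker-${count.index}"
    machine_type = "n1-standard-1"
    zone         = "${element(var.zones, count.index)}"

    disk {
        image = "${var.image}"
    }

    network_interface {
        network = "default"
        access_config {}
    }
}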

The current "best practice" for cabernets seems to be to run a cluster per zone. The nodes chat with the master often, so they generally should be in the same failure domain.

I have no problems running CoreOS and Kubernetes clusters spread across different zones in the same region.
A couple of times one of the zones had issues and nothing was working there, but that did not affect my setup much.

Not sure that running a cluster per zone really is the 'best practice'.

If you do multiple zones, for etcd you need an odd number of zones to make sure a simple majority can be maintained; you may also need to adjust timeouts. I've only done this in AWS, however.
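
For reference, the knobs involved would look something like this in Terraform (just a sketch; --heartbeat-interval and --election-timeout are real etcd flags, but these variables and the wiring into the cloud-config are hypothetical, not something kubestack has today):

# Hypothetical tuning variables for cross-zone etcd; the values shown are
# etcd's defaults, raise them when inter-zone latency is higher.
variable "etcd_heartbeat_interval" {
    default = "100"     # ms
}
variable "etcd_election_timeout" {
    default = "1000"    # ms; etcd's tuning docs suggest roughly 10x the heartbeat
}

# These would be interpolated into the etcd unit in each master's cloud-config,
# ending up as flags like:
#   etcd2 --heartbeat-interval=100 --election-timeout=1000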

I think I've got an idea for the multi-zone setup.
When running etcd clusters on GCE, no timeout adjustments are needed, but AWS multi-zone network latency is much worse; I had many issues there. Happy with my move to GCE.