ocgi/spec

k8s upgrade strategy

Closed this issue · 6 comments

axot commented

Thanks for sharing such a great project; it solves many Agones pain points.
We have a question: how does it behave when we perform a K8s upgrade?
Is it possible to build an HA architecture so that we can achieve zero downtime during a K8s upgrade?

@axot Thank you for your interest. Does "K8s upgrade" here refer to updating K8s pods, or to upgrading K8s components such as kube-apiserver?

axot commented

Hi, it includes both control plane (master/apiserver) and data plane (node/pod) upgrades; we have to be concerned about both.

With high-availability masters, we can upgrade any K8s component online. Of course, the master nodes MUST be upgraded one at a time.
For a game application, a Squad consists of multiple DS replicas. We can perform a rolling update of the replicas based on the Application Interactive Update feature, without stopping service for game players.
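
To make the rolling update idea concrete, here is a minimal sketch in Python. It is not Carrier's implementation: the `Replica` class, the `ready_for_update` flag (standing in for whatever signal a game server gives that it may be replaced), and the `max_unavailable` parameter are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    version: str
    # Hypothetical flag: the game server has agreed to be replaced,
    # for example because it no longer hosts any players.
    ready_for_update: bool


def rolling_update(replicas, new_version, max_unavailable=1):
    """Update replicas in batches of at most `max_unavailable`.

    Replicas that have not agreed to the update are skipped and keep
    serving on the old version. Returns the batches as lists of names.
    """
    pending = [
        r for r in replicas
        if r.version != new_version and r.ready_for_update
    ]
    batches = []
    for i in range(0, len(pending), max_unavailable):
        batch = pending[i:i + max_unavailable]
        for r in batch:
            r.version = new_version  # stand-in for recreating the replica
        batches.append([r.name for r in batch])
    return batches


squad = [
    Replica("ds-0", "v1", True),
    Replica("ds-1", "v1", False),  # still hosting players, left alone
    Replica("ds-2", "v1", True),
]
print(rolling_update(squad, "v2"))  # [['ds-0'], ['ds-2']]
```

Only one replica is touched per batch here, so the rest of the Squad keeps serving; `ds-1` stays on `v1` until it signals that it can be replaced.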

axot commented

Hi, how about the carrier controller? I think it will fail over to a new leader during the upgrade. So might there be a few seconds of delayed response for operations such as scaling game servers?

Yes, the master lease is currently 15 seconds, which may cause a scaling delay of a few seconds when carrier is upgraded.
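
As a rough way to reason about that delay, the sketch below bounds the time without an active leader under a simplified lease model: the leader renews the lease periodically, and a standby polls and takes over once the lease has gone unrenewed for the full lease duration. Only the 15-second lease comes from this thread; the renew interval, retry period, and the `graceful_release` option are assumptions for illustration and may not match how carrier is actually configured.

```python
def failover_delay_bounds(lease_duration, renew_interval, retry_period,
                          graceful_release=False):
    """Return (shortest, longest) seconds without an active leader.

    Simplified model, not carrier's actual logic:
    - the leader renews the lease every `renew_interval` seconds;
    - a standby checks the lease every `retry_period` seconds;
    - the standby takes over once the lease is `lease_duration` old.
    """
    if graceful_release:
        # Assumed behaviour: the old leader gives up the lease on
        # shutdown, so a standby wins on its next check.
        return 0, retry_period
    # Abrupt stop: the last renewal happened at most `renew_interval`
    # before the stop, and the standby may notice expiry one
    # `retry_period` late.
    shortest = max(lease_duration - renew_interval, 0)
    longest = lease_duration + retry_period
    return shortest, longest


# 15 s lease from the thread; 5 s renew and 2 s retry are assumed values.
print(failover_delay_bounds(15, 5, 2))                         # (10, 17)
print(failover_delay_bounds(15, 5, 2, graceful_release=True))  # (0, 2)
```

Under these assumptions, a delay of only a few seconds corresponds to the case where the old leader hands the lease back during a planned upgrade; an abrupt stop would wait for roughly the full lease to expire.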

@axot Would you like to try kruise-game (https://github.com/openkruise/kruise-game)?