/elife-flux-cluster

Definition of eLife's k8s cluster and deployments to it. Automatically applied via Flux.


eLife k8s/Flux Production Cluster

EKS cluster name: kubernetes-aws--flux-prod

Use this git repo to control the cluster state. No manual kubectl or helm CLI actions are needed or wanted.

  • Flux will try to apply any YAML file in this repo to the cluster
  • HelmController allows the use of Helm charts
  • We currently have three Kustomizations defined: crds, system and deployments, each pointing at the root directory of the same name. Only YAML files found in these folders are loaded, in dependency order (see "Kustomizations" below)
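
As an illustration of how HelmController lets a chart be deployed purely through YAML in this repo, here is a minimal sketch. Every name in it (example-charts, example-app, the chart URL, the version and the values) is hypothetical, and the apiVersion values assume a recent Flux release; older installations use beta versions of the same kinds.

```yaml
# Hypothetical chart source; the URL and names are placeholders.
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: example-charts
  namespace: flux-system
spec:
  interval: 1h
  url: https://charts.example.org
---
# Hypothetical release installed from the source above.
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: example-app
  namespace: example-app
spec:
  interval: 10m
  chart:
    spec:
      chart: example-app
      version: "1.2.3"          # pin a chart version so upgrades happen via git
      sourceRef:
        kind: HelmRepository
        name: example-charts
        namespace: flux-system
  values:
    replicaCount: 2             # assumed chart value, for illustration only
```

Committing a file like this under one of the three root directories would be enough for Flux to reconcile it; no helm CLI call is involved.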

Cluster infrastructure is defined in builder in the kubernetes-aws section.

Admins can configure kubectl for this cluster with:

    aws eks update-kubeconfig \
       --name kubernetes-aws--flux-prod \
       --role-arn arn:aws:iam::512686554592:role/kubernetes-aws--flux-prod--AmazonEKSUserRole

Dashboards

The #cluster-alerts slack channel receives alerts from:

  • Alertmanager
  • Healthchecks.io (monitors Alertmanager heartbeat)

Adding/Editing Deployments

Kustomizations

Cluster level Kustomizations

  • crds: Cluster managed CustomResourceDefinitions.

  • system: Cluster services that do not directly serve production use cases. Some infrastructure components need CRDs to exist before upgrading, so the system kustomization depends on the crds kustomization.

  • deployments: The production services. These all depend on infrastructure to serve traffic correctly, so the deployments kustomization depends on the system kustomization.

  • Flux tries to apply any .yaml file in the kustomization directories above

  • within each of those root folders, the directory structure is purely for human organisation

  • namespaces are managed using .yaml files

  • Flux will always apply the HEAD of master
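
The dependency order described above can be expressed with the dependsOn field of a Flux Kustomization. The following is a minimal sketch and not a copy of this repo's actual definitions: the flux-system namespace, the GitRepository name, the intervals and the apiVersion are assumptions.

```yaml
# Sketch of the "system" Kustomization waiting for "crds".
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: system
  namespace: flux-system        # assumed namespace
spec:
  interval: 10m                 # assumed reconcile interval
  path: ./system                # root directory named after the Kustomization
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system           # assumed source name
  dependsOn:
    - name: crds                # CRDs must be ready before system is applied
---
# Sketch of the "deployments" Kustomization waiting for "system".
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: deployments
  namespace: flux-system
spec:
  interval: 10m
  path: ./deployments
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: system
```

With dependsOn set, Flux only reconciles a Kustomization once the ones it depends on report ready.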

Each namespace is organised around an application, or an environment for an application, favouring the latter.

Individual Kustomizations

There is a growing number of kustomizations for apps or system components that abstract complexity. We can then deploy them with a Flux Kustomization object from one of the cluster kustomizations above. These kustomizations are stored in kustomizations/.
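
A Flux Kustomization object that deploys one of these stored kustomizations might look like the sketch below. The app name, target namespace and the path under the kustomizations directory are hypothetical; adjust them to the real directory layout.

```yaml
# Hypothetical example: deploy a shared kustomization into one namespace.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: example-app--prod
  namespace: flux-system                     # assumed namespace
spec:
  interval: 10m
  path: ./kustomizations/apps/example-app    # hypothetical path
  prune: true
  targetNamespace: example-app--prod         # hypothetical namespace for the app's resources
  sourceRef:
    kind: GitRepository
    name: flux-system                        # assumed source name
```

A file like this would itself live under deployments or system, so it is picked up by the matching cluster kustomization.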

Adding Helm Charts

Debugging

Services available on the Cluster

  • nginx-ingress (docs)
    • provides SSL termination
    • host entries ending in .elifesciences.org will be added to our zone by ExternalDNS
  • cert-manager with letsencrypt (docs/letsencrypt)
    • obtain letsencrypt SSL certs via ingress definitions
  • PrometheusOperator (docs/monitoring-alerting)
  • oauth2-proxy (docs/oauth-proxy)
    • limit access to elifesciences github org
  • SealedSecrets (docs/sealed-secrets.md)
    • encrypt secrets for safe storage in this repo
  • Loki
    • Stores logs for services in the cluster; queryable from Grafana as a data source.
  • Percona Server for MongoDB operator
    • Used to run a MongoDB cluster, with support for automated backup, recovery and upgrades.
    • Deployed in "cluster-wide" mode. Each namespace can deploy its own cluster of pods from the central operator.
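
To show how nginx-ingress, ExternalDNS and cert-manager from the list above combine, here is a minimal Ingress sketch. The app name, namespace, hostname, ingress class name and the letsencrypt issuer name are all assumptions; check the linked docs for the values this cluster actually uses.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-app                          # hypothetical app
  namespace: example-app--prod               # hypothetical namespace
  annotations:
    # Assumed ClusterIssuer name; asks cert-manager to obtain the certificate.
    cert-manager.io/cluster-issuer: letsencrypt
spec:
  ingressClassName: nginx                    # assumed ingress class name
  tls:
    - hosts:
        - example-app.elifesciences.org
      secretName: example-app-tls            # cert-manager stores the certificate here
  rules:
    # A host ending in .elifesciences.org is picked up by ExternalDNS.
    - host: example-app.elifesciences.org
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-app
                port:
                  number: 80
```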

Administration