m-lab/etl-gardener

Adjust node pools in data-processing clusters


The most recent change to the etl k8s configs failed to launch pods in staging, because I had set the 8-core node pool to 0 instances.

Also, utilization is low in all projects, because there are more nodes than the requested pods need.

All the pools should be updated to use appropriate auto-scaling configs.
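
For context, it may help to inspect the current pool configuration before changing anything. A minimal sketch, assuming the cluster is named `data-processing` (per this issue's title) and is a regional cluster in us-east1:

```sh
# Sketch: inspect current node pools before adjusting autoscaling.
# The --region and --project values are assumptions based on this thread.
gcloud container node-pools list \
    --cluster=data-processing --project=mlab-sandbox --region=us-east1

# Shows autoscaling settings, node counts, and zones for one pool.
# "default-pool" is GKE's usual name for the default pool; adjust as needed.
gcloud container node-pools describe default-pool \
    --cluster=data-processing --project=mlab-sandbox --region=us-east1
```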

Today I am (rough gcloud sketches follow this list):
- changing the sandbox default pool to allow 1 node per zone, and removing zone us-east1-d
- changing the staging 8-core parser-pool1 to allow 0-2 nodes per zone
- changing the staging 4-core parser-pool to allow 0-1 nodes per zone
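
Roughly what those changes look like with gcloud (a sketch, not the exact commands run; the remaining sandbox zones are assumptions). One detail worth remembering: for regional clusters, `--min-nodes`/`--max-nodes` apply per zone, which matches the "N nodes per zone" wording above. gcloud generally accepts only one kind of node-pool update per invocation, hence the separate commands.

```sh
# mlab-sandbox: drop us-east1-d from the default pool (the remaining zones
# listed here are assumptions), then allow 1 node per zone.
gcloud container node-pools update default-pool \
    --cluster=data-processing --project=mlab-sandbox --region=us-east1 \
    --node-locations=us-east1-b,us-east1-c
gcloud container node-pools update default-pool \
    --cluster=data-processing --project=mlab-sandbox --region=us-east1 \
    --enable-autoscaling --min-nodes=1 --max-nodes=1

# mlab-staging: 8-core parser-pool1, 0-2 nodes per zone.
gcloud container node-pools update parser-pool1 \
    --cluster=data-processing --project=mlab-staging --region=us-east1 \
    --enable-autoscaling --min-nodes=0 --max-nodes=2

# mlab-staging: 4-core parser-pool, 0-1 nodes per zone.
gcloud container node-pools update parser-pool \
    --cluster=data-processing --project=mlab-staging --region=us-east1 \
    --enable-autoscaling --min-nodes=0 --max-nodes=1
```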

Tomorrow, I intend to change prod to set up auto-scaling for parser, default, and gardener pools.

Removing zone us-east1-d from the default pool in mlab-sandbox made gardener unschedulable, because its persistent volume lives in that zone. I restored us-east1-d and adjusted auto-scaling to allow 1-2 nodes per zone.
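
That failure mode is the usual zonal-disk constraint: a GCE persistent disk lives in exactly one zone, so a pod bound to a PV backed by it can only schedule onto a node in that zone. A quick way to confirm which zone a PV is pinned to (the PV name below is a placeholder):

```sh
# List persistent volumes, then check the zone affinity of gardener's PV.
# <gardener-pv> is a placeholder; use a name from the first command.
kubectl get pv
kubectl describe pv <gardener-pv> | grep -i -A4 'Node Affinity'
```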

Later changed this to 0-2 nodes per zone.

Subsequently updated mlab-sandbox parser-pool1 to allow 0-2 nodes per zone as well.

@gfr10598 can you enumerate the steps that are needed before the next gardener release to production?

I just deleted the mlab-staging parser-pool1 node pool from the data-processing cluster. K8s had restarted the parsers, and they were scheduled onto the wrong node pool. Deleting the pool should prevent this from happening again.
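
For the record, the deletion is a one-liner, and pinning the parsers to their intended pool via a nodeSelector on GKE's built-in node label would make a repeat impossible. A sketch, assuming the deployment is named `etl-parser` and the intended pool is `parser-pool` (both assumptions):

```sh
# Remove the unused pool; region/project values assumed as above.
gcloud container node-pools delete parser-pool1 \
    --cluster=data-processing --project=mlab-staging --region=us-east1

# Pin parser pods to the intended pool so a restart cannot land them
# elsewhere. "etl-parser" and "parser-pool" are assumed names.
kubectl patch deployment etl-parser --type=merge -p \
    '{"spec":{"template":{"spec":{"nodeSelector":{"cloud.google.com/gke-nodepool":"parser-pool"}}}}}'
```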

See m-lab/etl#985, related to propagating errors from etl to gardener.