kubernetes-sigs/cluster-api

Optimize resource usage of our prowjobs

chrischdi opened this issue · 8 comments

Description

This is a tracking issue to go over all our prowjobs and optimise the configured requests and limits for memory and CPU.

Reserving too many resources incurs costs for the community, which is why we should review and optimise our prowjob configurations.

The following dashboard from sig-test-infra gives a view of the current resource usage of the jobs:

https://monitoring-eks.prow.k8s.io/d/53g2x7OZz/jobs?orgId=1&var-org=kubernetes-sigs&var-repo=cluster-api&from=1712214148796&to=1712225862895

Sheet I used to arrive at the new values: here
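The sheet's exact method is not reproduced in this issue. As a minimal sketch of one way to turn a peak value read off the dashboard into a new request, assuming a fixed headroom percentage and rounding up to a convenient step (the function name, the 20% headroom, and the step sizes are all hypothetical, not the values used for the actual PRs):

```python
def suggest_request(peak: int, headroom_percent: int = 20, step: int = 500) -> int:
    """Return peak usage plus headroom, rounded up to the next multiple of step.

    Units are up to the caller, e.g. milli-CPU or MiB. Integer arithmetic is
    used throughout so the rounding is exact.
    Assumption: headroom_percent and step are illustrative defaults only.
    """
    if peak < 0 or headroom_percent < 0 or step <= 0:
        raise ValueError("peak and headroom must be >= 0, step must be > 0")
    target_times_100 = peak * (100 + headroom_percent)
    # Ceiling division: number of whole steps needed to cover the target.
    steps_needed = -(-target_times_100 // (100 * step))
    return steps_needed * step


# Observed peak of 3000 milli-CPU, rounded up in 500m steps.
cpu_request = suggest_request(3000)
# Observed peak of 5100 MiB, rounded up in 512 MiB steps.
memory_request = suggest_request(5100, step=512)
```

Any value derived this way would still need to be checked against the dashboard after the change lands, since a single peak reading may not cover the slowest runs.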

The following prowjobs are to be adjusted:

  • kubernetes/test-infra#32376
    • presubmits:
      • pull-cluster-api-apidiff-main
      • pull-cluster-api-build-main
      • pull-cluster-api-verify-main
    • periodics:
      • periodic-cluster-api-e2e-conformance-ci-latest-main
      • periodic-cluster-api-e2e-conformance-main
      • periodic-cluster-api-e2e-dualstack-and-ipv6-main
      • periodic-cluster-api-e2e-main
      • periodic-cluster-api-e2e-mink8s-main
      • periodic-cluster-api-e2e-upgrade-1-24-1-25-main
      • periodic-cluster-api-e2e-upgrade-1-25-1-26-main
      • periodic-cluster-api-e2e-upgrade-1-26-1-27-main
      • periodic-cluster-api-e2e-upgrade-1-27-1-28-main
      • periodic-cluster-api-e2e-upgrade-1-28-1-29-main
      • periodic-cluster-api-e2e-upgrade-1-29-1-30-main
      • periodic-cluster-api-test-main
      • periodic-cluster-api-test-mink8s-main

In a second and/or third iteration we should also address the following jobs, adapting the values from the first round:

  • periodics:
    • periodic-cluster-api-e2e-conformance-ci-latest-release-1-7
    • periodic-cluster-api-e2e-conformance-release-1-7
    • periodic-cluster-api-e2e-dualstack-and-ipv6-release-1-5
    • periodic-cluster-api-e2e-dualstack-and-ipv6-release-1-6
    • periodic-cluster-api-e2e-dualstack-and-ipv6-release-1-7
    • periodic-cluster-api-e2e-mink8s-release-1-5
    • periodic-cluster-api-e2e-mink8s-release-1-6
    • periodic-cluster-api-e2e-mink8s-release-1-7
    • periodic-cluster-api-e2e-release-1-5
    • periodic-cluster-api-e2e-release-1-6
    • periodic-cluster-api-e2e-release-1-7
    • periodic-cluster-api-e2e-upgrade-1-22-1-23-release-1-5
    • periodic-cluster-api-e2e-upgrade-1-23-1-24-release-1-5
    • periodic-cluster-api-e2e-upgrade-1-23-1-24-release-1-6
    • periodic-cluster-api-e2e-upgrade-1-24-1-25-release-1-5
    • periodic-cluster-api-e2e-upgrade-1-24-1-25-release-1-6
    • periodic-cluster-api-e2e-upgrade-1-24-1-25-release-1-7
    • periodic-cluster-api-e2e-upgrade-1-25-1-26-release-1-5
    • periodic-cluster-api-e2e-upgrade-1-25-1-26-release-1-6
    • periodic-cluster-api-e2e-upgrade-1-25-1-26-release-1-7
    • periodic-cluster-api-e2e-upgrade-1-26-1-27-release-1-5
    • periodic-cluster-api-e2e-upgrade-1-26-1-27-release-1-6
    • periodic-cluster-api-e2e-upgrade-1-26-1-27-release-1-7
    • periodic-cluster-api-e2e-upgrade-1-27-1-28-release-1-5
    • periodic-cluster-api-e2e-upgrade-1-27-1-28-release-1-6
    • periodic-cluster-api-e2e-upgrade-1-27-1-28-release-1-7
    • periodic-cluster-api-e2e-upgrade-1-28-1-29-release-1-6
    • periodic-cluster-api-e2e-upgrade-1-28-1-29-release-1-7
    • periodic-cluster-api-e2e-upgrade-1-29-1-30-release-1-7
    • periodic-cluster-api-test-mink8s-release-1-5
    • periodic-cluster-api-test-mink8s-release-1-6
    • periodic-cluster-api-test-mink8s-release-1-7
    • periodic-cluster-api-test-release-1-5
    • periodic-cluster-api-test-release-1-6
    • periodic-cluster-api-test-release-1-7
  • presubmits:
    • pull-cluster-api-e2e-blocking-main
    • pull-cluster-api-e2e-conformance-ci-latest-main
    • pull-cluster-api-e2e-conformance-main
    • pull-cluster-api-e2e-dualstack-and-ipv6-main
    • pull-cluster-api-e2e-main
    • pull-cluster-api-e2e-mink8s-main
    • pull-cluster-api-e2e-upgrade-1-29-1-30-main
    • pull-cluster-api-test-main
    • pull-cluster-api-test-mink8s-main
    • pull-cluster-api-apidiff-release-1-5
    • pull-cluster-api-apidiff-release-1-6
    • pull-cluster-api-apidiff-release-1-7
    • pull-cluster-api-build-release-1-5
    • pull-cluster-api-build-release-1-6
    • pull-cluster-api-build-release-1-7
    • pull-cluster-api-e2e-blocking-release-1-5
    • pull-cluster-api-e2e-blocking-release-1-6
    • pull-cluster-api-e2e-blocking-release-1-7
    • pull-cluster-api-e2e-conformance-ci-latest-release-1-7
    • pull-cluster-api-e2e-conformance-release-1-7
    • pull-cluster-api-e2e-dualstack-and-ipv6-release-1-5
    • pull-cluster-api-e2e-dualstack-and-ipv6-release-1-6
    • pull-cluster-api-e2e-dualstack-and-ipv6-release-1-7
    • pull-cluster-api-e2e-informing-release-1-5
    • pull-cluster-api-e2e-mink8s-release-1-5
    • pull-cluster-api-e2e-mink8s-release-1-6
    • pull-cluster-api-e2e-mink8s-release-1-7
    • pull-cluster-api-e2e-release-1-5
    • pull-cluster-api-e2e-release-1-6
    • pull-cluster-api-e2e-release-1-7
    • pull-cluster-api-e2e-upgrade-1-27-1-28-release-1-5
    • pull-cluster-api-e2e-upgrade-1-28-1-29-release-1-6
    • pull-cluster-api-e2e-upgrade-1-29-1-30-release-1-7
    • pull-cluster-api-test-mink8s-release-1-5
    • pull-cluster-api-test-mink8s-release-1-6
    • pull-cluster-api-test-mink8s-release-1-7
    • pull-cluster-api-test-release-1-5
    • pull-cluster-api-test-release-1-6
    • pull-cluster-api-test-release-1-7
    • pull-cluster-api-verify-release-1-5
    • pull-cluster-api-verify-release-1-6
    • pull-cluster-api-verify-release-1-7
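The issue does not say which first-round job each later job should borrow its values from. One possible way to pair them, sketched here under the assumption that jobs sharing the same name once the `pull-`/`periodic-` prefix and the branch suffix are stripped have similar resource needs (`job_family` and `first_round_source` are hypothetical helpers, not part of test-infra):

```python
import re
from typing import Optional

# Branch suffixes as they appear in the job names above: "-main" or "-release-X-Y".
BRANCH_SUFFIX = re.compile(r"-(main|release-\d+-\d+)$")
JOB_PREFIXES = ("pull-cluster-api-", "periodic-cluster-api-")


def job_family(name: str) -> str:
    """Strip the job-type prefix and the branch suffix from a prowjob name."""
    for prefix in JOB_PREFIXES:
        if name.startswith(prefix):
            name = name[len(prefix):]
            break
    return BRANCH_SUFFIX.sub("", name)


def first_round_source(job: str, first_round: list[str]) -> Optional[str]:
    """Return the first-round job in the same family as job, or None."""
    by_family = {job_family(candidate): candidate for candidate in first_round}
    return by_family.get(job_family(job))


# A subset of the first-round jobs listed in this issue.
first_round = [
    "periodic-cluster-api-e2e-main",
    "periodic-cluster-api-e2e-upgrade-1-27-1-28-main",
    "periodic-cluster-api-test-main",
]

source = first_round_source(
    "pull-cluster-api-e2e-upgrade-1-27-1-28-release-1-5", first_round
)
```

Jobs without a first-round counterpart, such as the `e2e-blocking` and `e2e-informing` presubmits, return `None` under this scheme and would need values chosen by hand.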

/kind cleanup

/assign

/triage accepted

The following PR does the second round of optimisation by using the same values for all jobs:

I'll do another iteration after the next release is out.

/priority important-soon

I took a look at the dashboards, and I think it looks good for now. There is some buffer on the existing jobs, but not too much.

One exception where we might want to adjust:
#10575 (comment)

/close

@chrischdi: Closing this issue.

Thx, all good. I have to think about which job to put the linter in. Maybe it gets its own job.