Server resource request for running Metal3.io CI
First and Last Name
Kashif Khan
Company/Organization
Ericsson Software Technology
Job Title
Product Owner
Project Title (i.e., a summary of what do you want to do, not what is the name of the open source project you're working with)
Project Name: Metal3.io (https://metal3.io/ , https://github.com/metal3-io)
Plans to utilize the hardware for:
- Running CI pipelines for the Metal3.io project (e2e integration and feature tests, Prow jobs, etc.)
- Letting maintainers run the same tests manually.
Briefly describe the project (i.e., what are the details of what you're planning to do with these servers?)
- The Metal3.io project is an open-source solution for provisioning bare metal hosts for Kubernetes clusters and running cloud native workloads on them.
- Metal3.io is a CNCF sandbox project.
- The plan is to use the hardware resources to run CI jobs and Prow clusters for maintaining the open-source code base across the repositories in the organization.
- Most of the CI jobs need fairly large machines. For example, one e2e integration job requires 4 cores, 16 GB RAM, and 100 GB of disk, and one feature test requires 8 cores, 32 GB RAM, and 300 GB of disk. We have several of these e2e integration and feature tests.
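To make the sizing above concrete, here is a rough sketch of how many such jobs could run side by side. The job sizes and the server count come from this request. The per-server capacity in `ASSUMED_SERVER` is a hypothetical placeholder and should be replaced with the real figures for the chosen server type. The estimate also ignores host and hypervisor overhead, so treat it as an upper bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    """A bundle of compute resources (for a job or a server)."""
    cores: int
    ram_gb: int
    disk_gb: int


def jobs_per_server(server: Shape, job: Shape) -> int:
    """Jobs of one kind that fit on a server at once, limited by the scarcest resource."""
    return min(
        server.cores // job.cores,
        server.ram_gb // job.ram_gb,
        server.disk_gb // job.disk_gb,
    )


# Job sizes quoted in this request.
E2E_INTEGRATION = Shape(cores=4, ram_gb=16, disk_gb=100)
FEATURE_TEST = Shape(cores=8, ram_gb=32, disk_gb=300)

# ASSUMPTION: hypothetical server capacity, not taken from this request.
# Replace with the real figures for the chosen server type.
ASSUMED_SERVER = Shape(cores=8, ram_gb=64, disk_gb=960)

SERVER_COUNT = 6  # number of servers requested

for name, job in [("e2e integration", E2E_INTEGRATION), ("feature test", FEATURE_TEST)]:
    per_server = jobs_per_server(ASSUMED_SERVER, job)
    total = per_server * SERVER_COUNT
    print(f"{name}: {per_server} per server, {total} across {SERVER_COUNT} servers")
```

Under these assumed figures the CPU count is the limiting resource for both job types, which is why a feature test would occupy a whole server on its own.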
Is the code that you're going to run 100% open source? If so, what is the URL or URLs where it is located? What is your association with that project?
Yes, the code is 100% open source. Here again is the link to the project's GitHub organization: https://github.com/metal3-io. I am one of the maintainers. Here is the list of the other maintainers: https://github.com/metal3-io/community/blob/main/maintainers/ALL-OWNERS
What kind of machines and how many do you expect to use (see: https://deploy.equinix.com/product/bare-metal/servers/)?
6 servers of the type m3.small.x86
What operating system and networking are you planning to use?
CentOS and Ubuntu
Any other relevant details we should know about?
It would be good to allow access to all the maintainers mentioned here https://github.com/metal3-io/community/blob/main/maintainers/ALL-OWNERS
@kashifest before we proceed, can't you use GitHub Actions for the CI purposes instead (as the CIL resources can be limited) - https://github.com/cncf/cluster?tab=readme-ov-file#usage-guidelines
> @kashifest before we proceed, can't you use GitHub Actions for the CI purposes instead (as the CIL resources can be limited) - https://github.com/cncf/cluster?tab=readme-ov-file#usage-guidelines
@idvoretskyi we usually run GitHub workflows for smaller jobs such as linters. The rest of our CI is resource intensive, so dedicated hardware would be needed.
P.S. The linked usage guidelines say "see example below", but I don't see any example on that page. Any idea where I can get more information on this?
@jeefy @idvoretskyi in the meantime we are testing GitHub Actions with large runners as you suggested. Unfortunately, the Equinix ones are offline and not usable at the moment. We used the default large runners for one of our e2e tests, and the usage shows some billable hours. Any idea what that means for us? Does the bill go to CNCF because our GitHub org is part of the CNCF enterprise account, so that we don't need to worry about it? We are still unsure how much of a performance issue this will be once we start putting the e2e PR jobs on these runners, because getting an available runner seems to be random. The e2e tests in our other repos have bigger resource requirements, so for those we would need some dedicated resources anyway.
@jeefy @idvoretskyi any updates?
@kashifest yep, apologies for the delay. Working on this internally.
re: earlier comment, no worries about the billable hours with metal3's Org being under the CNCF GH Enterprise license.
The Equinix and Oracle runners should be available for use if you wouldn't mind trying those again. :) If you run into issues or need a different shape/size please let us know.
If those don't work we can set you up in Equinix, we'll just need you to only use resources when needed, so spinning them up/down automatically.
@jeefy Thanks a lot for replying back.
> The Equinix and Oracle runners should be available for use if you wouldn't mind trying those again. :) If you run into issues or need a different shape/size please let us know.
I don't think it's working, or I might have configured something incorrectly. The job has been waiting for 40 minutes or so for a runner to become available. See the attachment.
Here is the PR: metal3-io/baremetal-operator#1775
> If those don't work we can set you up in Equinix, we'll just need you to only use resources when needed, so spinning them up/down automatically.
I believe this would be needed anyway for the larger e2e jobs in our org.