thrumdev/blobs

Self-hosted runner


I've been playing a bit with GHA self-hosted runners and it seems that the transition is not as smooth as one might expect.

Here are the options I've found that might be considered:

  1. A full-featured orchestrator (based on Kubernetes), community-driven (complicated, secure, scales well): https://github.com/actions/actions-runner-controller
  2. A dedicated server with a bunch of sysbox-runtime docker containers running (relatively simple, secure, fixed number of runners): https://github.com/docker/github-actions-runner
  3. GitHub-hosted large machines / connecting an Azure subscription (easy, secure, scales well, but I bet it's pricey).

My suggestion would be to go with option (2), since it's easy to start with and doesn't have many drawbacks. The https://github.com/nestybox/sysbox/ runtime should ensure that whatever job is running cannot escalate privileges and take over the host machine. However, the solution does not scale dynamically: only a fixed number of jobs can run concurrently, and the rest have to wait in the queue.
That said, the build process without any optimisations is down to around 7 minutes on a Hetzner EX44 machine (which I believe is a good trade-off between cost and the performance we need).
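
To put rough numbers on the fixed-capacity trade-off, here is a minimal sketch, not measured data. It assumes every job takes the ~7 minutes quoted above and that all jobs are queued at the same moment; the runner count and the function name `queue_waits` are hypothetical.

```python
import heapq


def queue_waits(num_jobs, num_runners, build_minutes=7.0):
    """Return how long each job waits before it starts, in minutes.

    Assumes all jobs are queued at time 0 and each build takes
    `build_minutes` on any runner.
    """
    # Min-heap of the times at which each runner becomes free.
    free_at = [0.0] * num_runners
    heapq.heapify(free_at)

    waits = []
    for _ in range(num_jobs):
        start = heapq.heappop(free_at)  # earliest available runner
        waits.append(start)
        heapq.heappush(free_at, start + build_minutes)
    return waits


# Five queued jobs on two runners (hypothetical numbers).
print(queue_waits(num_jobs=5, num_runners=2))
# -> [0.0, 0.0, 7.0, 7.0, 14.0]
```

With few heavy builds in flight at once, the wait stays at zero or one build length, which is the case where a fixed pool is good enough.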

The minimal changes to the build process, and the results of running it on a self-hosted runner, can be found here:
https://github.com/tomusdrw/thrumdev-blobs/pull/1/files

Optimizing the build itself.

Going with a self-hosted runner requires at least some small changes to the build script, since I didn't find ready-to-use docker images that correspond to GHA VM images like ubuntu-latest.
This basically means installing a few tools at the start (build-essential, libclang-dev, rustup, and the protobuf-compiler that was already there).
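
One way to catch a runner image that lacks these tools is a small preflight check before the build starts. This is a sketch only: the mapping from package to binary is an assumption (build-essential pulls in `gcc`, protobuf-compiler provides `protoc`), `missing_tools` is a hypothetical helper, and libclang-dev is left out because it ships a library rather than a command.

```python
import shutil

# Assumed mapping: package named in the issue -> one binary it provides.
# libclang-dev is not listed because it installs a library, not a command.
REQUIRED_TOOLS = {
    "build-essential": "gcc",
    "rustup": "rustup",
    "protobuf-compiler": "protoc",
}


def missing_tools(required=REQUIRED_TOOLS, which=shutil.which):
    """Return the packages whose binary is not found on PATH, sorted."""
    return sorted(pkg for pkg, binary in required.items() if which(binary) is None)


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        raise SystemExit("missing on this runner: " + ", ".join(missing))
    print("all required tools found")
```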

Another option, which would also improve performance, is to run the whole build process within another docker container. With just the GHA runner, the build happens within the context of the docker/github-actions-runner image, which is https://hub.docker.com/r/rodnymolina588/gha-sysbox-runner (looks a bit shady, I know :). It's based on Ubuntu 22.04 LTS.

Instead, since sysbox allows running Docker-in-Docker, during the build step it's possible to spawn another docker container that will have a pristine, reproducible environment to build the entire project. That obviously benefits from docker's layer caching, and additional tools like cargo-chef can be used to speed up workspace builds.
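
The nested-container idea can be sketched as a build step that assembles a `docker run` invocation. Everything specific here is an assumption: the `rust:1` image, the `/work` mount point, the example workspace path, and the helper name `docker_build_command`. It does not cover layer caching or cargo-chef, which would need a Dockerfile.

```python
import shlex


def docker_build_command(workspace, image="rust:1",
                         build_cmd=("cargo", "build", "--release")):
    """Build the argv for running the project build in a fresh container."""
    return [
        "docker", "run", "--rm",      # remove the container when the build ends
        "-v", f"{workspace}:/work",   # mount the checked-out repo
        "-w", "/work",                # run the build from the repo root
        image,
        *build_cmd,
    ]


# Hypothetical workspace path on the runner.
cmd = docker_build_command("/home/runner/work/blobs")
print(shlex.join(cmd))
# To actually run it from the build step: subprocess.run(cmd, check=True)
```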

I believe @pepyakin has been working a bit on that front, so the work can easily be merged together.

Let me know what the decision is and we can coordinate off-github to add the self-hosted runner to the repo.

Thanks for the writeup @tomusdrw!

I lean towards option (2) as well and am happy to commit to it unless convinced otherwise. We have a pretty simple setup, with relatively few anticipated heavy CI builds, especially once we enable the new merge queue feature.

When it comes to optimizing the build itself,

> Instead, since sysbox allows running Docker-in-Docker, during the build step it's possible to spawn another docker container that will have pristine, reproducible environment to build the entire project

Yes, that should fit well with what @pepyakin was working on.

> Let me know what the decision is and we can coordinate off-github to add the self-hosted runner to the repo.

Don't we need to have the setup working correctly first, and then we can add it? Let's sync up off-github.

> Don't we need to have the setup working correctly first, and then we can add it?

Just created #122. For this PR to work, the self-hosted runner needs to be registered as described in: https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/adding-self-hosted-runners#adding-a-self-hosted-runner-to-a-repository