
Run ephemeral self-hosted runners for GitHub Actions with LXD

Primary language: Python. License: GNU General Public License v2.0 (GPL-2.0)

LXDRunner

Experimental daemon using LXD to run ephemeral GitHub Actions self-hosted runners.

Why use LXD and self-hosted runners?

  • Trivial to switch between containers and VMs.
  • High density and fast startup when running containers.
  • Create your own OS images and get a pristine environment every time.
  • Automatic download and provisioning of the latest GHA runner client.
  • Access custom hardware: Serial, USB, and PCI attached devices such as phones, GPUs, etc.
  • Access sensitive resources that must be handled locally.

How it works

LXDRunner runs an API endpoint that waits for webhook events from GitHub. No LXD instances run until they are needed, so idle resource usage is minimal. Every time Actions runs a workflow, GitHub sends one event for each job. LXDRunner reacts to each workflow_job event in queued status by:

  • Mapping the workflow labels to a specific LXD config (image, profile, container type, etc.).
  • Launching a pristine LXD instance based on the matching config.
  • Provisioning the instance with the latest GHA runner client to complete the job.
  • Letting the GHA runner shut down and deregister automatically when the job is complete.
  • Destroying the LXD instance, just like GitHub-hosted runners.
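
The first step above, filtering events and mapping labels to an LXD config, could look roughly like the sketch below. It is not taken from the LXDRunner source: RUNNER_MAP, match_config, handle_event, and the image names are hypothetical, and exact, case-insensitive label matching is an assumption. Only the action and workflow_job.labels fields come from GitHub's workflow_job webhook payload.

```python
from typing import Optional

# Hypothetical runner map: each entry pairs a set of workflow labels
# with the LXD settings to use for jobs that request those labels.
RUNNER_MAP = [
    {"labels": {"self-hosted", "linux", "container"}, "image": "ubuntu/focal", "type": "container"},
    {"labels": {"self-hosted", "linux", "vm"}, "image": "ubuntu/focal", "type": "virtual-machine"},
]


def match_config(job_labels, runner_map=RUNNER_MAP) -> Optional[dict]:
    """Return the first entry whose label set equals the job's labels (case-insensitive)."""
    wanted = {label.lower() for label in job_labels}
    for entry in runner_map:
        if {label.lower() for label in entry["labels"]} == wanted:
            return entry
    return None


def handle_event(event_name: str, payload: dict) -> Optional[dict]:
    """Return the LXD settings for a queued workflow_job event, or None to ignore it."""
    if event_name != "workflow_job" or payload.get("action") != "queued":
        return None
    labels = payload.get("workflow_job", {}).get("labels", [])
    return match_config(labels)
```

Events with any other action (for example completed) and jobs whose labels match no entry return None, so nothing is launched for them.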

Periodic Events

  • Every 24 hours: check for a new version of the Actions runner.
  • Every 12 hours: clean up any offline runner registrations.
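
One way to drive these periodic events is a small "what is due now" check called from the daemon's main loop. This is a minimal sketch, not LXDRunner's actual scheduler: the task names, due_tasks, and run_due are hypothetical, and only the two intervals come from the list above.

```python
import time

# Intervals from the list above, in seconds. Task names are hypothetical.
INTERVALS = {
    "check_runner_version": 24 * 60 * 60,
    "cleanup_offline_runners": 12 * 60 * 60,
}


def due_tasks(last_run: dict, now: float) -> list:
    """Return the names of tasks whose interval has elapsed since they last ran."""
    return sorted(
        name
        for name, interval in INTERVALS.items()
        # A task that has never run is treated as due immediately.
        if now - last_run.get(name, float("-inf")) >= interval
    )


def run_due(last_run: dict, handlers: dict, now: float = None) -> list:
    """Run every due task, record when it ran, and return the names that ran."""
    now = time.time() if now is None else now
    ran = due_tasks(last_run, now)
    for name in ran:
        handlers[name]()
        last_run[name] = now
    return ran
```

Passing now explicitly keeps the logic testable without waiting for real time to pass.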

Scaling

KISS: scaling is based only on incoming webhooks from GitHub. For each event, one runner is launched on the fly.

Instances are ephemeral: they are automatically deregistered and shut down on completion.

More complex scaling could be achieved using the GitHub API, at the expense of higher job latency and higher API and resource usage.

Since the release of the ephemeral runner feature, this is now the recommended scaling strategy. See https://docs.github.com/en/actions/hosting-your-own-runners/autoscaling-with-self-hosted-runners

Limitations:

  • Workflow runs fail immediately if no runners with matching labels are registered. Remedy this by manually registering a runner with matching labels and leaving it permanently offline; runs will then be queued instead of failing.
  • Runner provisioning is based on a bash script. It probably doesn't work on anything other than Ubuntu/Debian-based distros without modification.

Setup:

LXDRunner works with both repo and organization runners. The service must be accessible over the internet in order to receive webhooks from GitHub. The API endpoint is protected by a secret and TLS. If you want to restrict access by IP, you can retrieve a list of GitHub IPs using the Meta API.
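
As an illustration of the two protections mentioned above, the sketch below checks a webhook delivery against the shared secret and against a list of allowed IP ranges. It is not LXDRunner's own code: signature_is_valid and ip_is_allowed are hypothetical names. It assumes the secret is verified the way GitHub documents it, as an HMAC-SHA256 of the raw request body sent in the X-Hub-Signature-256 header, and that the CIDR list has already been fetched from the hooks field of the Meta API response.

```python
import hashlib
import hmac
import ipaddress


def signature_is_valid(secret: str, body: bytes, signature_header: str) -> bool:
    """Compare the X-Hub-Signature-256 header against an HMAC-SHA256 of the raw body."""
    if not signature_header.startswith("sha256="):
        return False
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = ("sha256=" + digest).encode("utf-8")
    # compare_digest runs in constant time, so it does not leak how much matched.
    return hmac.compare_digest(expected, signature_header.encode("utf-8"))


def ip_is_allowed(remote_addr: str, hook_cidrs) -> bool:
    """Return True if remote_addr falls inside any of the given CIDR ranges."""
    address = ipaddress.ip_address(remote_addr)
    return any(address in ipaddress.ip_network(cidr) for cidr in hook_cidrs)
```

The signature must be computed over the exact bytes received, before any JSON parsing, or the digests will not match.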

GitHub Setup

Repeat these steps once for each organization and/or user repo you want serviced.

LXDRunner Configuration

  • Copy config.yml.example to config.yml
  • Edit config.yml:
    • Set your GitHub PAT
    • Set your webhook secret
    • Set up remotes
    • Set up the runnermap. This section maps a set of Actions workflow labels to specific LXD settings such as name, image source, profiles, container type, etc.
  • LXDRunner will search for config.yml in the current directory or at $HOME/.config/lxdrunner/config.yml. Specify an explicit location with lxdrunner -c path/to/config.yml
  • Run some GitHub Actions workflows to test.
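
The config lookup described above can be sketched as follows. This is an illustration rather than the project's actual code: find_config is a hypothetical name, and checking the current directory before $HOME/.config/lxdrunner is an assumption about the search order.

```python
from pathlib import Path
from typing import Optional


def find_config(
    explicit: Optional[str] = None,
    cwd: Optional[Path] = None,
    home: Optional[Path] = None,
) -> Optional[Path]:
    """Return the config file to use, or None if none can be found."""
    # An explicit path (the -c option) bypasses the search entirely.
    if explicit:
        path = Path(explicit)
        return path if path.is_file() else None
    cwd = cwd or Path.cwd()
    home = home or Path.home()
    candidates = (
        cwd / "config.yml",
        home / ".config" / "lxdrunner" / "config.yml",
    )
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```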

LXDRunner Installation

Requirements: Python 3.8 with pip

The simplest methods are using pip to install a whl package or downloading an LXD image. Download the latest release.

  • Install with pip:

    pip install lxdrunner-0.5.0-py3-none-any.whl

    Create config.yml, then start with: lxdrunner -c config.yml

  • Install Alpine based LXD image:

    # Import image into LXD and start container
    lxc image import --alias lxdrunner lxdrunner-alpine.img.tar.gz
    lxc launch lxdrunner lxdrunner
    

    Create the configuration file at /home/app/.config/lxdrunner/config.yml

Development

LXDRunner Development Install

Requirements: Python 3.8 with pip

  • Clone this repo. Set up a Python virtual environment if needed.
  • Install with: pip install -e ".[dev]"
  • Create the configuration as detailed above.
  • Run LXDRunner from the repo directory: python -m lxdrunner
  • You can install lxdrunner to the default locations with: pip install ./

TODO:

  • Investigate race condition between cloud-init and setup script adduser
  • Fix TLS verification
  • Dedup queue
  • Investigate whether pylxd is thread-safe
  • Explore alternative provisioning methods (prebaked images, disk mounts, etc.)
  • Auto configuration of webhooks through API
  • Auto registration of offline placeholder runners
  • More logging
  • More tests

DONE:

  • Limit workers per label-set
  • Remote LXD server and image support
  • Add support for multiple label maps
  • Make changes for the ephemeral fix (actions/runner issue 510)
  • During startup and periodically:
    • Query GH for queued runs that might have been missed or lost.
    • Clean up offline runner registrations and expired LXD workers.