A GitHub Actions caching server for self-hosted runners. By using `terrycain/cache@custom-url` you can now upload job caches to a local server.
This repo is still in an alpha-like state; things will most likely change, especially the Helm charts and Kustomize manifests.
I have a variety of self-hosted runners running actions for some private repos, as I'd rather not pay for build minutes. I decided to look into building some Go projects with Bazel; all was going well until I tried to speed up build times with actions/cache. It turns out Bazel's build cache is rather large, and where I'm building these actions does not have the best upload speed, so I descended the rabbit hole of getting self-hosted runners to cache against an alternate caching server. See here for a more in-depth view of how GitHub Actions caching works.
To start with, you will need to update your `actions/cache` step to use my fork until I can PR the changes upstream. Below is an example of a cache step using an external server (GitHub repo here):
```yaml
- name: Cache test
  uses: terrycain/cache@custom-url
  with:
    external-url: "http://172.20.0.20:8080/"
    path: /tmp/test1234/
    key: ${{ runner.os }}-docker-${{ github.sha }}
    restore-keys: |
      ${{ runner.os }}-docker-
```
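For context, a minimal workflow sketch running that step on a self-hosted runner might look like the following; the workflow name, trigger, checkout step, cache path and key are placeholders to adapt to your setup.

```yaml
# Illustrative sketch only -- adapt the trigger, runner labels, paths and keys.
name: build
on: push

jobs:
  build:
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v3
      - name: Cache test
        uses: terrycain/cache@custom-url
        with:
          external-url: "http://172.20.0.20:8080/"
          path: /tmp/test1234/
          key: ${{ runner.os }}-docker-${{ github.sha }}
          restore-keys: |
            ${{ runner.os }}-docker-
      # ... build steps that populate /tmp/test1234/ go here ...
```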
For now, there should not be any additional path on the server's URL. The forked action currently takes the path from the original cache server's URL, which contains a repository-scoped identifier, and appends it to the URL you provide. There is no fundamental reason why the caching server could not run on a subpath; it's just not done yet (feel free to PR).
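Roughly, the rewriting works like the sketch below. The host, scope identifier and query string are illustrative assumptions about what the runner normally sends; they are not copied from the fork's source.

```
# Request actions/cache would normally make against the hosted cache API (illustrative):
https://<github-cache-host>/<scope-id>/_apis/artifactcache/cache?keys=...

# With external-url set, the fork sends the same path to your server instead:
http://172.20.0.20:8080/<scope-id>/_apis/artifactcache/cache?keys=...
```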
The server consists of two major parts: one that deals with cache metadata and one that deals with cache storage. I have tried to design the server with some degree of modularity in mind so that it can fit into whatever environment it lands in, so eventually there will be various database and storage backends to choose from. There are trade-offs depending on which combination you choose; e.g. if you choose SQLite for the database, you probably won't want to read/write to it from multiple processes (although the docs seem to claim this now works :/).
Docker image: `ghcr.io/terrycain/actions-cache-server:0.1.7` (see the image repo if I forget to update the image tag).
Running `--help` on the container will list all the arguments it takes, all of which can also be defined as environment variables. You will need to specify a `--db-something` and a `--storage-something` argument.
See BACKENDS.md for a more detailed description of each backend and the format of the args.
Example deployment using Docker, with SQLite and basic disk storage:
```bash
mkdir db cache
docker run --rm -it -p 8080:8080 \
  -v $(pwd)/db:/tmp/db \
  -v $(pwd)/cache:/tmp/cache \
  ghcr.io/terrycain/actions-cache-server:0.1.7 \
  --db-sqlite /tmp/db/db.sqlite \
  --storage-disk /tmp/cache \
  --listen-address 0.0.0.0:8080
```
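If you would rather run it under Docker Compose, the same SQLite + disk deployment can be sketched like this; it uses only the flags and image shown above, and the host paths are just placeholders.

```yaml
# docker-compose.yml -- sketch of the SQLite + disk deployment shown above.
version: "3.8"
services:
  actions-cache-server:
    image: ghcr.io/terrycain/actions-cache-server:0.1.7
    command:
      - "--db-sqlite"
      - "/tmp/db/db.sqlite"
      - "--storage-disk"
      - "/tmp/cache"
      - "--listen-address"
      - "0.0.0.0:8080"
    ports:
      - "8080:8080"
    volumes:
      - ./db:/tmp/db
      - ./cache:/tmp/cache
```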
| Type | Name | Supported |
|---|---|---|
| Database | SQLite | ✔️ |
| Database | Postgres | ✔️ |
| Database | MySQL | ❌ Will do if there is demand |
| Database | DynamoDB | ❌ Will do if there is demand |
| Database | MongoDB | ❌ Will do if there is demand |
| Database | CosmosDB | ❌ Will do if there is demand |
| Storage | Disk | ✔️ |
| Storage | AWS S3 | ✔️ |
| Storage | Azure Blob Storage | ✔️ |
| Storage | Google Cloud Storage | ❌ Will do if there is demand |
The GitHub Actions docs seem to indicate that caches are partially shared with forks. This is not supported; I currently have no idea how to make it work, nor any intention to. If someone really needs this, raise an issue and we can look into it.
"Roadmap" sounds better than a glorified todo list 😄
- Azure Blob Storage backend
- Cache space usage / management
- Benchmark CPU and memory usage, especially on large PATCH requests
Feel free to raise pull requests. For any major changes, please raise an issue first so that they can be discussed and duplicated work avoided. Obviously, update/add tests where appropriate. Some info around testing is here.