DeterminateSystems/magic-nix-cache-action

Feature request: use a single cache entry

deemp opened this issue · 3 comments

deemp commented

Hi!

I used your GH Action and got a GH API error just as you warned. Also, I saw that GH Caches had a lot of entries just like in this (magic-nix-cache-action) repo.

There are other approaches to caching the store.

Approach 1

Use cachix.
Advantages:

  • Caches only new paths
  • Avoids caching paths from public binary caches
  • Provides LRU cache eviction.
  • A cache can be shared

Disadvantages:

  • May be down
  • May be slow
  • Limited cache size

Approach 2

Host a custom binary cache like attic.
I haven't used it. I suspect it has approximately the same advantages and disadvantages as the cachix option.

Approach 3

Use a chroot local store.

Advantages:

  • Fast restore and save

Disadvantages:

  • Doesn't work on macOS
  • Need to adjust symlinks after nix build as they point to /nix/store/bar and not to the actual prefix/nix/store/bar.

Approach 4

Use an intermediate store with nix-store --{import,export} or nix copy --{from,to} (cachix/install-nix-action#56), possibly keeping only a working set of paths by using atime.

Attempts:

Advantages:

  • Get a single Caches entry
  • Cache only a working set of paths

Disadvantages:

  • Can't restore selectively
  • Need to wait for restoring and store copying (spoiler: really slow)
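The "working set by atime" idea can be sketched roughly as follows. This is only an illustration, not what any existing action does: the function names `latest_atime` and `working_set` are hypothetical, and it assumes the store's filesystem actually records access times (mounts with noatime will not, and relatime updates them only under certain conditions).

```python
import os
from pathlib import Path


def latest_atime(path: Path) -> float:
    """Newest access time of `path` itself or of anything beneath it."""
    newest = path.lstat().st_atime
    if path.is_dir() and not path.is_symlink():
        for root, dirs, files in os.walk(path):
            for name in dirs + files:
                # lstat each entry before os.walk descends into it
                entry_atime = os.lstat(os.path.join(root, name)).st_atime
                newest = max(newest, entry_atime)
    return newest


def working_set(store_dir: Path, since: float) -> list[Path]:
    """Top-level entries of `store_dir` accessed at or after `since` (epoch seconds)."""
    return sorted(p for p in store_dir.iterdir() if latest_atime(p) >= since)
```

One would record a timestamp at the start of the job and call `working_set(Path("/nix/store"), since=start)` once at the end. Scanning reads directories, which can itself bump their atime, so a second scan may report more paths than the first.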

Approach 5

Don't use an intermediate store.
Use nix-quick-install-action to restore and save /nix directly (nixbuild/nix-quick-install-action#33).

Advantages:

  • Really fast

Disadvantages:

We're also struggling with rate limits. @batteries-included has a nix mono-repo with nix + rust + node + elixir. Each language has its own dependencies, so the number of things in the store is very high. Most of the starting deps are node packages or very small rust crates.

In the end we have a large dependency tree like this:

graph TD;
  WebCSSBundle-->NodeBuildDerivation;
  CSSSourceCode-->NodeBuildDerivation;
  JSSourceCode-->NodeBuildDerivation;
  NodeBuildDerivation-->NodeDep1;
  NodeBuildDerivation-->NodeDep2;
  NodeBuildDerivation-->NodeDep3;
  NodeBuildDerivation-->NodeDepN;
  
  RustBinary-->RustBuildDerivation;
  RustSourceCode-->RustBuildDerivation;
  RustBuildDerivation-->CargoDep1;
  RustBuildDerivation-->CargoDep2;
  RustBuildDerivation-->CargoDep3;
  RustBuildDerivation-->CargoDepN;

Any change to the input source code for a build derivation results in a cache miss on that build derivation, which then needs all of its inputs.

We tried cachix. It's useful, but since each GitHub worker starts with an empty store, we sometimes spend as many as 10 minutes fetching the small dependencies sequentially. Fetching from cachix and fetching from npm are around the same speed.

magic-nix-cache-action is faster for us than cachix (comparing on a warm branch or on master). However, it still struggles with rate limits: as before, we spend a large amount of time sequentially fetching dependencies. Fetching from the GitHub cache in a GitHub worker is faster than going to npm or cachix, so this is a win. However, we still see a lot of failures when reading from the cache because of the number of dependency cache fetches. These failures then propagate into failures to push to the cache.

I would love to see:

  • A way to get the last N or most frequently used N paths from store and restore them in one cache action.
  • A way to put the last N or most frequently used N paths back into the store
  • A way to wait a while at the end to ensure the cache is written. (I would rather spend 60 seconds on sleeping and retrying than give up on the first error.)
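The last wish, trading up to 60 seconds of sleep/retry time for not giving up on the first error, could look something like the sketch below. It is a generic retry loop with a time budget, not part of magic-nix-cache-action; `retry_with_budget` and its parameters are hypothetical names, and `action` stands in for whatever call writes to the cache.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_budget(
    action: Callable[[], T],
    budget_seconds: float = 60.0,
    first_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call `action` until it succeeds or the time budget is used up."""
    deadline = clock() + budget_seconds
    delay = first_delay
    while True:
        try:
            return action()
        except Exception:
            remaining = deadline - clock()
            if remaining <= 0:
                raise  # budget exhausted: surface the last error
            sleep(min(delay, remaining))
            delay *= 2  # back off so retries don't add to rate limiting
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.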

deemp commented

@elliottneilclark

  1. As for "the last N used paths", Approach 4 makes it possible to see which paths were accessed during a run.
  2. I believe it's possible to run a watcher like inotifywait and record how many times each file was accessed.
    However, I'm not sure it's possible to partially restore a cache. So one cannot cache the N most frequently used paths and then restore only M of them.
  3. With the last approach, writing a cache is just a serialization of nix store paths via nix-store --export, so it's unlikely to fail. Caching can still fail if there's no Caches space left. As the last approach makes it possible to cache only the working set of paths, one can keep only the latest relevant cache.
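For point 2, picking the N most frequently used paths from a watcher's output might be sketched like this. It assumes, hypothetically, that the watcher was configured to log one full file path per access event; `store_path_of` and `most_used` are made-up helper names.

```python
from collections import Counter
from pathlib import PurePosixPath
from typing import Iterable, Optional

STORE = PurePosixPath("/nix/store")


def store_path_of(file_path: str) -> Optional[str]:
    """Map a file inside the store to its top-level store path, else None."""
    try:
        rel = PurePosixPath(file_path).relative_to(STORE)
    except ValueError:
        return None  # not under /nix/store
    if not rel.parts:
        return None  # the store directory itself
    return str(STORE / rel.parts[0])


def most_used(log_lines: Iterable[str], n: int) -> list[str]:
    """Return the n store paths with the most access events."""
    counts: Counter = Counter()
    for line in log_lines:
        path = store_path_of(line.strip())
        if path is not None:
            counts[path] += 1
    return [path for path, _ in counts.most_common(n)]
```

The resulting list could then be handed to nix-store --export, which still leaves open the question of restoring only part of a cache.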

These suggestions are just crutches. I hope someone makes an action after considering these ideas. Currently, I have no experience in writing GH actions and not much time.

deemp commented

@elliottneilclark, I'm currently going to use the nix-quick-install-action + actions/cache approach described in https://github.com/nix-community/cache-nix-action as it solves the original issue. Feel free to reopen this issue.