chainguard-dev/rules_apko

Shared lock file


Hello friends!

First, I want to thank you for your work on this so far. It's a really nice little way to put together Alpine containers and leverage the developing Wolfi ecosystem.

I was hoping to get some perspective on a good pattern for sharing a lockfile between multiple instantiations of apko_image. In a larger repository where I want to create multiple images, there will be a substantial number of translate_apko_lock calls in my MODULE file, each of which manages a separate cache of these (potentially common) dependencies. What would be nice is a single lock file reflecting a snapshot of dependencies from my upstream, which could then be shared between multiple apko_images. In this model, I can ensure that multiple containers use the same version of some dependencies (and therefore upgrade together), while only managing one lockfile. It also makes it easier to wrap apko_image in a macro or rule, since I know I wouldn't have to modify the MODULE file as well.

I think I could hack this right now by having a beefy config file that contains all expected dependencies from the upstream, locking that config, and then passing it into translate_apko_lock so it could be used as the contents for all my apko_images, but that seems a little inelegant. I can see how this could maybe be reflected as a custom rule, but it would require access to the private apko_run rule.
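For illustration, a minimal sketch of that workaround, assuming the Bazel module extension and attribute names below (they may differ from your rules_apko version; the lock/config file names and image labels are hypothetical):

```starlark
# MODULE.bazel -- one shared lock for every image (sketch; the
# extension path and attribute names are assumptions, check the
# rules_apko docs for your version)
apko = use_extension("@rules_apko//apko:extensions.bzl", "apko")
apko.translate_lock(
    name = "shared_lock",
    lock = "//:apko.lock.json",  # lock generated from the superset config
)
use_repo(apko, "shared_lock")
```

```starlark
# BUILD.bazel -- multiple images drawing on the same lock repository
load("@rules_apko//apko:defs.bzl", "apko_image")

apko_image(
    name = "image_a",
    config = "apko_a.yaml",  # per-image subset of packages
    contents = "@shared_lock//:contents",
    tag = "image-a:latest",
)

apko_image(
    name = "image_b",
    config = "apko_b.yaml",
    contents = "@shared_lock//:contents",
    tag = "image-b:latest",
)
```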

Would love your thoughts on whether or not this is a common use case, and if so, what tweaks could be made to make it possible. I'm happy to assist with a PR.

Thanks!

I think I could hack this right now by having a beefy config file that contains all expected dependencies from the upstream, locking that config, and then passing it into translate_apko_lock so it could be used as the contents for all my apko_images, but that seems a little inelegant.

Having just tried this, note that it doesn't work with the current apko/rules_apko tools. The lockfile is treated as the actual list of packages to install, and the list in apko.yaml is silently ignored (apart from some misleading informational output).

Just wanted to dump some thoughts on the current implementation and why a repository_rule per image might be unavoidable. Disclaimer: it's likely some information here is not correct :D

  • repository_rule is Bazel's way to download remote contents (in the end, most calls boil down to rctx.download)
  • When any file in a repository is referenced, the whole repository is fetched.
  • translate_apko_lock generates multiple repositories:
    • primary: the one that contains the contents target and the apko_repositories macro, which declares all secondary repository rules
    • secondaries: for each package there is a separate repository_rule, which handles the download of that single package and exposes a target for it.
  • The contents target must declare all packages' targets as dependencies. This is done by generating a BUILD.bazel in the primary repository of translate_apko_lock.
  • Dependencies must be declared explicitly for the analysis phase.
  • To get the list of dependencies you need to read the lockfile, which for a build rule can only happen in the execution phase. A repository rule, on the other hand, can read the file earlier, during fetching.
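To make the two-repository layout concrete, the primary repository's generated BUILD.bazel might look roughly like this. This is an illustrative sketch, not the actual generated code; the repository and package names are made up:

```starlark
# @example_lock//:BUILD.bazel (sketch of what translate_apko_lock
# generates; the real output differs)
filegroup(
    name = "contents",
    srcs = [
        # one target per package, each in its own secondary
        # repository so it can be fetched independently
        "@example_lock_ca-certificates-bundle//:all",
        "@example_lock_wolfi-baselayout//:all",
        # ... one entry per package in the lockfile
    ],
    visibility = ["//visibility:public"],
)
```

Because the package list is baked into this file at fetch time, the set of dependencies is known to Bazel's analysis phase without ever executing an action that reads the lockfile.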

There could be a model with one big translate_apko_lock that contains all of the packages' targets, while the lockfile is provided separately to each apko_image:

apko_image(
    contents = all_packages_for_all_images,
    lockfile = lockfile_for_specific_image,
)

But then building a single image would download the packages needed for all images in your repo. If your use case is bazel build //... it doesn't matter much (apart from the many more symlinks added for each apko_image rule), since you would eventually download all the packages anyway. But as a generic solution this would be wasteful.

A note about the repository cache: it is content-addressable (keyed by hash), so even if multiple lockfiles contain the same package, it will be downloaded only once.
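If you want to lean on that deduplication across workspaces or CI runs, the cache location can be pinned explicitly. --repository_cache is a standard Bazel flag; the path below is just an example:

```
# .bazelrc -- share one download cache across workspaces/CI runs
# (the path is an example; pick one appropriate for your setup)
common --repository_cache=~/.cache/bazel-repository-cache
```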