tinkerbell/tink

Action Images are very tightly tied to OSIE


Tink starts by booting OSIE, OSIE runs the workflow engine, the workflow engine runs some Action Images as docker containers, and hopefully at the last step of the workflow the system boots into the freshly installed OS.
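Conceptually, each step of a workflow boils down to running an Action Image on the machine being provisioned, inside the OSIE environment. A hand-written sketch of what that amounts to (the image name and flags here are placeholders for illustration, not the actual invocation the workflow engine uses):

```sh
# Hypothetical sketch only: the workflow engine drives the Docker API itself,
# but each action is roughly equivalent to running an image like this on the
# provisioned machine, inside OSIE.
docker run --rm --privileged \
  -v /dev:/dev \
  quay.io/example/disk-wipe-action:v1   # placeholder image name
```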

These Action Images, though, are limited by what OSIE can do. In fact, this binding is quite tight, and probably unintentionally so.

Some examples of this:

  • BSDs can't really do a "native" installation by running their own software; they first have to create an installer that works under Linux.
  • Windows is similar, probably closer to spewing bytes to disk than anything else.
  • The OS must install to a filesystem supported by OSIE.
  • Hardware must be supported by OSIE. Action Images can't bring their own hardware support.
  • Lots of tools actually have to be "matched pairs", where the CLI utility version is tied to a specific version of the kernel module: wireguard, zfs, iptables, and nvidia, among many others. Incompatibilities can cause simple breakage, instability, or crashes.

One way I was able to get around this was to:

  1. use a privileged docker container to mount the host's /etc
  2. examine /etc/issue
  3. create a docker container on the fly based on that version of alpine
  4. build the packages I required
  5. export them to a package archive
  6. use a privileged docker container to mount the host's / to /hostroot in the container
  7. chroot /hostroot apk add ... to install my required kernel modules to the host
  8. chroot /hostroot modprobe ... to load the kernel modules
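Roughly, those steps correspond to something like the following (a hedged reconstruction, not the exact commands; the package names, paths, and Alpine version are placeholders):

```sh
# Sketch of the workaround; package names, paths, and the Alpine version are placeholders.

# Steps 1-2: find out which Alpine release the host (OSIE) is built on
docker run --rm -v /etc:/host-etc:ro alpine cat /host-etc/issue

# Steps 3-5: build the needed packages in a throwaway container matching that
# release (pretend "3.12" is what /etc/issue reported), then copy them out
docker run --rm -v "$PWD/pkgs:/out" alpine:3.12 sh -c '
  apk add --no-cache alpine-sdk &&
  # ... build the kernel-module packages here (e.g. a wireguard build) ...
  cp /path/to/built/*.apk /out/
'

# Steps 6-8: bind-mount the host root, then install and load the modules via chroot
docker run --rm --privileged \
  -v /:/hostroot -v "$PWD/pkgs:/hostroot/tmp/pkgs" alpine sh -c '
  chroot /hostroot apk add --allow-untrusted /tmp/pkgs/*.apk &&
  chroot /hostroot modprobe wireguard
'
```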

But this is not actually a viable solution. In particular, I was stuck in the mud for a bit because OSIE uses a lightly patched version of Alpine's kernel, making it tricky to do this custom build.

Some different ways to work around this might be:

  1. support Action Images being light-weight VMs, via something like qemu.

This option is the most flexible, as these VMs could be minimal in size while supporting exactly the hardware and environment the user needs. Further, they can be built with matched pairs of kernel modules and CLI tools.
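To make that concrete: an action-as-VM could be little more than a kernel, an initramfs carrying the action's tooling and matching kernel modules, and the target disk passed through. A hypothetical sketch (artifact names and flags are illustrative, not a worked-out design):

```sh
# Hypothetical sketch: run an action as a tiny VM instead of a container.
# Because the action ships its own kernel and initramfs, its kernel modules
# and userspace tools are always a matched pair, independent of OSIE's kernel.
qemu-system-x86_64 \
  -enable-kvm -m 256M -nographic \
  -kernel action-vmlinuz \
  -initrd action-initramfs.img \
  -append "console=ttyS0" \
  -drive file=/dev/sda,format=raw,if=virtio
```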

  2. allow a workflow to specify its own URL, ID, or name for its OSIE workflow base

This is probably desirable anyway. As I mentioned, the Action Images are quite tightly tied to what the host provides. This sort of binding can be hard to upgrade, especially when the purpose of the Action Images is to fiddle bits on the hardware. Being able to pin and version them would likely help a lot.

With the sandbox moving to https://github.com/tinkerbell/hook, is this still an issue, @grahamc?

I think "support Action Images being light-weight VMs, via something like qemu." would be really cool. I've added a help wanted label because I think we need someone to build a demo/proof of concept for this idea.

I've raised a discussion in the Tinkerbell Roadmap where we can figure out whether and what we could do about this. Closing the issue, as we don't necessarily intend to fulfill it.

tinkerbell/roadmap#10