seanrmurphy/nix-container-build-gha

nvidia-docker does not work when `dockerTools.buildLayeredImage` contains packages grouped with `buildEnv`

Hi!
Thank you for all of the information you are providing in your blog and in this repo.
I am rather new to Nix and stumbled into a weird problem. I was wondering whether this has happened to you or whether it is a potential Nix bug.
If I group packages with `pkgs.buildEnv` and put the result into `contents = [ ]` of `buildLayeredImage`, running the container with `docker --gpus=all` (on Ubuntu 20.04) fails with the following error:

OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: mount error: file creation failed: /mnt/docker/overlay2/9264fff237fb7d187a2669389670f00aa51d380173e095a53f1c992e5d715ed8/merged/lib/firmware/nvidia/550.54.15/gsp_ga10x.bin: file exists: unknown.

However, if I replace `buildEnv` with `symlinkJoin`, it works.
It also works if I keep `buildEnv` but build the Docker image with `buildImage` rather than `buildLayeredImage`.
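
For reference, a minimal sketch of the setup described above might look like the following. The file layout, image name (`gpu-test`), environment name, and package list are illustrative assumptions, not the actual flake from this repo; only the combination of `buildEnv` inside `contents` of `buildLayeredImage` is taken from the report.

```nix
# Hypothetical minimal reproduction; names and packages are placeholders.
{ pkgs ? import <nixpkgs> { } }:

let
  # Packages grouped into a single environment with buildEnv
  # (the variant reported to fail under `--gpus=all`).
  env = pkgs.buildEnv {
    name = "image-env";
    paths = [ pkgs.coreutils pkgs.bashInteractive ];
  };

  # Variant reported to work: same paths, grouped with symlinkJoin instead.
  # env = pkgs.symlinkJoin {
  #   name = "image-env";
  #   paths = [ pkgs.coreutils pkgs.bashInteractive ];
  # };
in
# Replacing buildLayeredImage with buildImage here is the other
# variant reported to work while still using buildEnv.
pkgs.dockerTools.buildLayeredImage {
  name = "gpu-test";
  tag = "latest";
  contents = [ env ];
  config.Cmd = [ "${pkgs.bashInteractive}/bin/bash" ];
}
```

Building this with `nix-build`, loading the result with `docker load < result`, and starting the container with the GPU flag should be enough to compare the variants, assuming the NVIDIA container runtime is installed on the host.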

PS: The driver file `gsp_ga10x.bin` definitely does not exist in the Docker image.

This also happens with the flake.nix provided in this repo (with a minimal image, no Python, but with packages grouped with `buildEnv`).
What do you make of this?
Thank you!

Hi @Avnerus - thanks for this and apols for being slow to respond here - lemme look into this and get back to you later this week.