uber-archive/makisu

Hard links image size bug, also a COPY bug?

LouisStAmour opened this issue · 1 comment

Hi folks,

I thought I'd make a newer Alpine image and maybe install another docker tool while I was at it.

I ended up with a test Dockerfile like the following:

FROM alpine:3.13.2
COPY --from=gcr.io/uber-container-tools/makisu:v0.4.1 . .
RUN apk add --no-cache git skopeo
ENTRYPOINT /bin/sh

But Makisu produced a Docker image that was 437 MB, and 373 MB of that was in the /usr/libexec/git-core directory according to dive.

By comparison, the above image when produced with Docker for Mac or Kaniko is 85 MB according to dive and looks basically identical in both. Now, Kaniko has its own speed issues, and I liked the caching options and simplicity of the code for Makisu, but if the git package for Alpine uses hard links and Makisu can't handle that, well, I'd rather not copy around an extra 300+ MB just for git's hard links.

Bonus bug: the second line COPY --from=gcr.io/uber-container-tools/makisu:v0.4.1 . . does not do what it's supposed to with Makisu. I expected it to copy the /makisu-internal folder and contents from the image into the new image. Both Kaniko and Docker for Mac do what I expect, but Makisu v0.4.1 seems to treat it as a no-op, seemingly copying nothing at that step.

I also encountered strange behavior regarding hard links using v0.4.2, so I'm adding my comment here since it could be related. Take the following Dockerfile:

FROM busybox:stable
RUN echo "foo"

The busybox image has a size of ~1MB. Using dive, I've noticed that the echo "foo" step seems to eliminate the links from the commands to the single busybox binary. Instead, it appears that the content of the original binary is copied for every command, so 1MB for ls, 1MB for awk, and so on. Instead of a new image of ~1.1MB (e.g. using docker build), the Makisu-built image ends up at 457MB.

1MB -> 457MB just by adding echo "foo"!
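
For anyone who wants to see the size effect in isolation, here is a minimal sketch in Python. It is not Makisu's code and says nothing about where the bug actually lives; it only shows how much a tar layer grows when every hard-linked name is written out as a full regular file instead of a link entry. The helper names (`make_busybox_like_tree`, `layer_size`) and the 1 MiB / 10-link numbers are made up for illustration, and it assumes a filesystem that supports hard links.

```python
import io
import os
import tarfile
import tempfile


def make_busybox_like_tree(root: str, payload_size: int, link_count: int) -> None:
    """Create one 'binary' plus `link_count` hard links to it (hypothetical layout)."""
    binary = os.path.join(root, "busybox")
    with open(binary, "wb") as f:
        f.write(b"\0" * payload_size)
    for i in range(link_count):
        # Each applet name is a hard link: same inode, no extra data on disk.
        os.link(binary, os.path.join(root, f"applet{i}"))


def layer_size(root: str, preserve_hard_links: bool) -> int:
    """Return the size in bytes of an uncompressed tar archive of `root`."""
    buf = io.BytesIO()
    # With dereference=False (the default), tarfile stores the first name of an
    # inode as a regular file and later names as link entries with no payload.
    # With dereference=True it stores every name as a full regular file, which
    # mimics the duplication described in this issue. (It also follows
    # symlinks, but this test tree contains none.)
    with tarfile.open(fileobj=buf, mode="w",
                      dereference=not preserve_hard_links) as tar:
        tar.add(root, arcname=".")
    return len(buf.getvalue())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        make_busybox_like_tree(tmp, payload_size=1024 * 1024, link_count=10)
        print("links preserved: ", layer_size(tmp, preserve_hard_links=True))
        print("links duplicated:", layer_size(tmp, preserve_hard_links=False))
```

With links preserved the archive stays close to the size of the single binary; with links duplicated it grows roughly in proportion to the number of names, which matches the kind of blow-up reported above for both busybox and git-core.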