fedora-cloud/Fedora-Dockerfiles

yum update considered harmful?

Opened this issue · 8 comments

At the top of each Dockerfile is:

yum -y update && yum -y clean all

At this time, this updates 82 packages and installs one dependent package when the Dockerfile is based on the vanilla Fedora image, i.e.:

FROM fedora:latest

This will bloat the image. Because the dependent images are built at different times, the contents of the Fedora updates repository will differ and Docker will not be able to cache the intermediate steps.

In other words, if you install more than one of the images from this repository, you will end up with the original Fedora release, plus the updates for that release, N times.

So why run 'yum update' at the top of each Dockerfile? You might say that it is important not to have images with outdated packages, but then the published base Docker image seems outdated.

Why not periodically update the base fedora docker image, and link each of the derived images to it on the docker index so that when it updates the derived images are rebuilt? Then there will be exactly one copy of the bits in an up-to-date fedora image backing each derived image.

Hey @groks, I agree that for production setups you wouldn't want to run yum update on a container. You'd always want to be on a known good package level. Running yum update throws that away. For Fedora images, I figured most would want to try the latest of what was out. The base image is always going to be out of sync from an updated layered image (unless you update the base image daily). I think some of these things are still to be figured out. So, we have 3 things that can get out of sync:

  1. base image
  2. layered images (after yum update)
  3. images stored on Docker hub (these aren't rebuilt daily or even weekly either). They are working on policy for this.

I need to mull this over a bit more, but feel free to /cc anyone else who you think might have input. I agree these things need more thought.

@scollier I think that Fedora-Dockerfiles is about learning & teaching. I don't think people would use those images in prod, and I agree that updating packages is rather good practice, as you're sure you're playing with the latest toys.

But @groks really has a point. Maybe we should add a comment to each Dockerfile noting that the yum update step is not always what you want, and that users should think about whether it's really what they need?

The ultimate goal here is to update the base image regularly. We just have some release engineering and QA work to get to the point where we feel good about doing that. #helpwanted

In the meantime, is there a straightforward way of updating the local copy of the base image?
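(As far as I know, assuming the base image is published on the registry as fedora:latest, you can just pull it again, which refreshes the local copy, though any derived images still have to be rebuilt against it:

docker pull fedora:latest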

The simplest solution here is to have a fedora/dockerfiles-base image of some sort that does the yum update, and then base all the other images on that one. This solves both problems: (a) you can ensure that the fedora-dockerfiles images have a recent package set, and (b) you conserve space because everything is based on a common image, rather than performing the updates independently.
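As a rough sketch (the name fedora/dockerfiles-base is hypothetical, not an image that exists today), the shared base Dockerfile would carry only the update step:

FROM fedora:latest
RUN yum -y update && yum -y clean all

and each derived Dockerfile would start from it instead of fedora:latest and only install what it needs, e.g.:

FROM fedora/dockerfiles-base
RUN yum -y install httpd && yum -y clean all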

znmeb commented

The problem is that the Docker base Fedora image doesn't update whenever a package needs to be updated. That needs to be part of the release / QA process and automated. Till that's done, a Fedora-based Dockerfile has to do a yum or dnf upgrade to make sure the base packages are up to date.

Related to all of the above, having separate

RUN yum -y update && yum clean all
RUN yum -y install ...

means that we take the hit of downloading 40MB of repo metadata twice. In my Dockerfiles, I'm just concatenating them into one line.
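Roughly, that means collapsing the two steps into a single RUN, something like the following (httpd is just a placeholder package):

RUN yum -y update && yum -y install httpd && yum -y clean all

That way the repo metadata is only downloaded once, and the clean runs in the same layer so the cached data doesn't end up baked into the image.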