ai2cm/pace

Pace requires `make build`

jdahm opened this issue · 3 comments

jdahm commented

If a user runs `make savepoint_tests_*` from within Pace, Docker returns `Unable to find image 'us.gcr.io/vcm-ml/pace:latest' locally`. We need to run `make build` first. This artifact doesn't even exist on https://console.cloud.google.com/gcr/images/vcm-ml?referrer=search&project=vcm-ml.

jdahm commented

I did a bit of investigating, and it looks like we no longer have a CI plan that builds and pushes these images. The tests we run on Google Cloud build the image locally every time, tag it, then run the container from that image. This isn't all bad, since some steps appear to be cached, presumably because we use the same virtual machine:

```
make[3]: Entering directory '/tmp/workspace/pace-fv3core-regression_PR/backend/numpy/experiment/c12_6ranks_standard/runaction/run_regression_tests/slave/gce-cpu'
DOCKER_BUILDKIT=1 docker build \
	-f /tmp/workspace/pace-fv3core-regression_PR/backend/numpy/experiment/c12_6ranks_standard/runaction/run_regression_tests/slave/gce-cpu/Dockerfile \
	-t us.gcr.io/vcm-ml/pace \
	.
#1 [internal] load .dockerignore
#1 transferring context: 190B done
#1 DONE 0.2s

#2 [internal] load build definition from Dockerfile
#2 transferring dockerfile: 593B done
#2 DONE 0.2s

#3 [internal] load metadata for docker.io/library/python:3.8.13-bullseye@sh...
#3 DONE 0.0s

#4 [1/5] FROM docker.io/library/python:3.8.13-bullseye@sha256:2a01d88a1684e...
#4 DONE 0.0s

#7 [internal] load build context
#7 transferring context: 14.90MB 1.0s done
#7 DONE 1.1s

#5 [2/5] RUN apt-get update && apt-get install -y make     software-propert...
#5 CACHED

#6 [3/5] RUN pip3 install --upgrade setuptools wheel
#6 CACHED

#8 [4/5] COPY . /pace
#8 DONE 25.3s
```

The next step should be to add a CI plan that runs on every commit to main and pushes a new image to GCR. That said, GCR is now supposed to be replaced by Google Artifact Registry: https://cloud.google.com/artifact-registry/docs/transition/transition-from-gcr. Alternatively, we could use the GitHub Container Registry: https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry.
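For illustration, a rough sketch of what such a CI step might run on each commit to main. The function name `build_and_push` and the `REGISTRY_IMAGE` variable are hypothetical, the commit SHA is assumed to be supplied by the CI system, and the sketch assumes the runner is already authenticated to the registry.

```shell
#!/bin/sh
# Hypothetical CI step: build the image, tag it with the commit SHA and
# "latest", then push both tags.
# Assumption: the runner has already authenticated to the registry
# (for GCR, e.g. via `gcloud auth configure-docker`).
# REGISTRY_IMAGE defaults to the image name used in this thread.
REGISTRY_IMAGE="${REGISTRY_IMAGE:-us.gcr.io/vcm-ml/pace}"

build_and_push() {
    sha="$1"  # commit SHA, assumed to be provided by the CI system

    # Build once and apply both tags to the same image.
    DOCKER_BUILDKIT=1 docker build \
        -t "$REGISTRY_IMAGE:$sha" \
        -t "$REGISTRY_IMAGE:latest" \
        . || return 1

    # Push the immutable per-commit tag first, then move "latest".
    docker push "$REGISTRY_IMAGE:$sha" || return 1
    docker push "$REGISTRY_IMAGE:latest"
}
```

A CI job would call something like `build_and_push "$(git rev-parse HEAD)"` from the repository root. Switching to Artifact Registry or the GitHub Container Registry would mostly mean changing `REGISTRY_IMAGE` and the authentication step.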

jdahm commented

What do you think @elynnwu, @mcgibbon, @twicki?

Pushing images to GCR is a pain (because of authentication), and it's not hard to build this image. I'd suggest fixing this by modifying the `DEV=y` pathway so that it builds the image if and only if the image does not already exist locally.
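A minimal shell sketch of that build-if-missing check, assuming the image name from the error message at the top of this issue and that `make build` is what produces it. The function names `image_exists` and `ensure_image` are hypothetical, and how this would be wired into the `DEV=y` pathway of the Makefile is not shown.

```shell
#!/bin/sh
# Hypothetical helper: build the Pace image only when it is missing locally.
# IMAGE defaults to the name Docker reported as not found in this issue.
IMAGE="${IMAGE:-us.gcr.io/vcm-ml/pace:latest}"

# `docker image inspect` exits non-zero when the image is not present
# in the local image store.
image_exists() {
    docker image inspect "$1" >/dev/null 2>&1
}

ensure_image() {
    if image_exists "$1"; then
        echo "found $1 locally, skipping build"
    else
        echo "image $1 not found locally, building"
        # Assumption: `make build` tags the image under the same name.
        make build
    fi
}
```

Calling `ensure_image "$IMAGE"` before the test target starts its container would avoid both the `Unable to find image` error and any need to pull from a registry.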

I don't think we need to proactively change our image names right now, if we need to at all by the end of the project.