cml runner seems to try and pull images from a quay.io repo instead of dockerhub
AlistairMaccallum opened this issue · 6 comments
Bug Report
runner:
Warning Failed 8s (x2 over 52s) kubelet Failed to pull image "dvcorg/cml:0-dvc2-base1-gpu": rpc error: code = Unknown desc = reading manifest 0-dvc2-base1-gpu in quay.io/dvcorg/cml: unauthorized: access to the requested resource is not authorized
Description
cml runner seems to try and pull images from a quay.io repo instead of https://hub.docker.com/r/iterativeai/cml/tags
Reproduce
cml runner launch \
--cloud=kubernetes \
--labels=cml-k8s-gpu
Expected
Kubernetes is able to pull the required image for the Job
Environment information
Attempting to run jobs via GitHub ARC and Kubernetes in an on-premises cluster.
Additional Information (if any):
@AlistairMaccallum I suspect this is coming from some OpenShift k8s configuration.
After some searching, the output of cat /etc/containers/registries.conf may shed some light on this.
Regardless, you should be able to resolve this by explicitly setting the image like so:
cml runner launch \
--cloud=kubernetes \
--labels=cml-k8s-gpu \
--cloud-image=ghcr.io/iterative/cml:0-dvc3-base1
or your choice of image.
Thanks @dacbd, that's working now. However, is there a way to provide credentials, or to point to an existing k8s registry secret within the cluster, to pull from a private ECR repo?
Just for reference, my /etc/containers/registries.conf looks like this, so I'm not sure why it was trying quay.io first. Either way, it's probably better to be explicit about the image used.
# # An array of host[:port] registries to try when pulling an unqualified image, in order.
unqualified-search-registries = ["docker.io", "quay.io"]
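With that search list, an unqualified name such as dvcorg/cml:0-dvc2-base1-gpu can expand to one candidate per registry, tried in order. If the docker.io attempt fails, the error that surfaces may be the one from the last registry tried, which would be consistent with the quay.io message in the log above. A minimal sketch of that expansion follows; the function name expand_short_name is made up for illustration, and it only mimics the ordering rather than how the container runtime is implemented.

```shell
# Hypothetical helper: print the fully qualified candidates that an
# unqualified image name expands to, in search order.
# Usage: expand_short_name IMAGE REGISTRY...
expand_short_name() {
  image="$1"
  shift
  for registry in "$@"; do
    printf '%s/%s\n' "$registry" "$image"
  done
}

# Registries taken from the unqualified-search-registries line above.
expand_short_name "dvcorg/cml:0-dvc2-base1-gpu" docker.io quay.io
```

Passing a fully qualified reference to --cloud-image, as suggested earlier in the thread, sidesteps the search list entirely.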
@AlistairMaccallum do you mean like this: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
or are you trying to pull another image from your workflow that is running in the cml image?
@dacbd Yes, I have a k8s registry secret like the one described in the link. I'm trying to run cml like this:
cml runner launch \
--cloud=kubernetes \
--labels=cml-k8s-gpu \
--cloud-image=aws-id.dkr.ecr.aws-region.amazonaws.com/my-image:my-tag
I had tried this as part of the job, but I suspect it doesn't help because it is the k8s cluster that needs the credential to pull the image, not the container the action is running in.
steps:
  - name: Configure AWS credentials
    uses: aws-actions/configure-aws-credentials@v4 # More information on this action can be found below in the 'AWS Credentials' section
    with:
      aws-region: eu-west-2
  - name: Login to Amazon ECR
    id: login-ecr
    uses: aws-actions/amazon-ecr-login@v2
@AlistairMaccallum, that's correct: it would be k8s doing the pulling of the container and not the cml command.
I'm sure there are plenty of help articles out there for accessing ECR from your k8s cluster. If you get stuck feel free to reach out again but I'm not sure how much help we can be.
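For reference, here is a sketch of the approach from the Kubernetes documentation linked above, adapted for ECR: create a docker-registry secret from a short-lived ECR token and attach it to the service account whose pods need to pull the image. The function name, the secret name myregcred, and the use of the default service account are assumptions for illustration; nothing in this thread confirms which namespace or service account the runner Job uses. ECR tokens expire after 12 hours, so the secret would need to be refreshed periodically.

```shell
# Hypothetical helper; requires kubectl and the AWS CLI with valid credentials.
# Usage: ensure_ecr_pull_secret SECRET_NAME ECR_REGISTRY AWS_REGION [SERVICE_ACCOUNT]
ensure_ecr_pull_secret() {
  secret_name="$1"
  registry="$2"                     # e.g. aws-id.dkr.ecr.aws-region.amazonaws.com
  region="$3"
  service_account="${4:-default}"   # assumption: runner pods use this account

  # ECR accepts the fixed username "AWS" with a temporary token as the password.
  token="$(aws ecr get-login-password --region "$region")" || return 1

  # Render the secret client-side and apply it, so re-running refreshes the token.
  kubectl create secret docker-registry "$secret_name" \
    --docker-server="$registry" \
    --docker-username=AWS \
    --docker-password="$token" \
    --dry-run=client -o yaml | kubectl apply -f - || return 1

  # New pods created under this service account get the pull secret attached.
  kubectl patch serviceaccount "$service_account" \
    -p "{\"imagePullSecrets\": [{\"name\": \"$secret_name\"}]}"
}
```

It could then be called as, for example, ensure_ecr_pull_secret myregcred aws-id.dkr.ecr.aws-region.amazonaws.com eu-west-2, in the namespace where the runner pods are created.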
@dacbd This seems to answer my question: #1342. However, I think a more intuitive way would be to have an additional flag for the runner, maybe something like this, where you can specify a secret that already exists in Kubernetes:
cml runner launch \
--cloud=kubernetes \
--labels=cml-k8s-gpu \
--cloud-image=aws-id.dkr.ecr.aws-region.amazonaws.com/my-image:my-tag \
--cloud-image-secret=myregcred