kubernetes-sigs/image-builder

Include ecr-credential-provider binary in CAPI images

nickperry opened this issue · 8 comments

Is your feature request related to a problem? Please describe.

There is a regression in image-builder CAPI images based on Kubernetes 1.27+: unlike images based on 1.26.x and earlier, the kubelet can no longer pull images from private ECR repos.

In Kubernetes 1.27, the in-tree kubelet credential provider for AWS was removed (kubernetes/kubernetes#116329). This followed GA of the external kubelet credential provider feature in 1.26.

Describe the solution you'd like
We would like the CAPV OVA templates published at https://storage.googleapis.com/capv-templates/ to include the ecr-credential-provider binary.

It is suggested that the binary be dropped into /usr/local/bin.

If users wish to make use of the credential provider, they will be responsible for configuring the following items via KubeadmConfig.Files and KubeadmConfig.PreKubeadmCommands or another method of their choosing:

  • IAM credentials.
  • Creation of a CredentialProviderConfig file, located for example at /etc/kubernetes/credential-provider-config.
  • Adding --image-credential-provider-bin-dir=/usr/local/bin --image-credential-provider-config=/etc/kubernetes/credential-provider-config to KUBELET_EXTRA_ARGS.
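For anyone wiring this up by hand, the last two items could be sketched roughly as below. This is only a sketch: the function names and the optional root prefix are hypothetical, the config content is the example given later in this issue, and IAM credentials are deliberately left out.

```shell
#!/bin/sh
# Sketch only: function names and the root prefix argument are hypothetical.

# Write the CredentialProviderConfig under an optional root prefix ($1),
# so the function can be tried in a scratch directory before a real node.
write_credential_provider_config() {
  root="${1:-}"
  config="${root}/etc/kubernetes/credential-provider-config"
  mkdir -p "$(dirname "$config")" || return 1
  cat > "$config" <<'EOF'
apiVersion: kubelet.config.k8s.io/v1
kind: CredentialProviderConfig
providers:
  - name: ecr-credential-provider
    matchImages:
      - "*.dkr.ecr.*.amazonaws.com"
    defaultCacheDuration: "12h"
    apiVersion: credentialprovider.kubelet.k8s.io/v1
    env:
      - name: AWS_PROFILE
        value: default
EOF
}

# Print the two kubelet flags that point at the binary directory
# and at the config file written above.
kubelet_credential_provider_args() {
  printf '%s %s\n' \
    '--image-credential-provider-bin-dir=/usr/local/bin' \
    '--image-credential-provider-config=/etc/kubernetes/credential-provider-config'
}
```

From PreKubeadmCommands this might be used by calling write_credential_provider_config with no argument and then appending KUBELET_EXTRA_ARGS="$(kubelet_credential_provider_args)" to /etc/default/kubelet. That path is an assumption that holds for Debian-style kubeadm packages; RPM-based images typically use /etc/sysconfig/kubelet instead.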

Describe alternatives you've considered
Use KubeadmConfig.PreKubeadmCommands to retrieve the ecr-credential-provider binary as each node is bootstrapped.
Custom building images.

Additional context
At my organisation, we pull most of our images from private ECR repos, so since this removal in 1.27, we need the external ecr-credential-provider binary in our CAPV machines.

It would make sense to co-ordinate the solution for this with the CAPA maintainers.

I am certain that AWS customers will want this in CAPA images. It is also fairly likely that some commercial Tanzu customers will want this functionality.

Image-builder's custom_role functionality may be useful.

To the best of my knowledge, there is no publicly hosted binary artefact for ecr-credential-provider. The source can be retrieved from https://github.com/kubernetes/cloud-provider-aws/releases/tag/v1.27.1 and built with:

cd cloud-provider-aws-1.27.1/cmd/ecr-credential-provider
go build 

The ecr-credential-provider binary is ~20MB as built, or ~4.5MB after stripping and compressing with goupx.

/kind feature

Example CredentialProviderConfig file contents:

apiVersion: kubelet.config.k8s.io/v1
kind: CredentialProviderConfig
providers:
  - name: ecr-credential-provider
    matchImages:
      - "*.dkr.ecr.*.amazonaws.com"
    defaultCacheDuration: "12h"
    apiVersion: credentialprovider.kubelet.k8s.io/v1
    env:
      - name: AWS_PROFILE
        value: default 

@nickperry have you managed to have a working configuration for this? 🤔

I tested ecr-credential-provider successfully in CAPI. I just had to retrieve the binary using curl as each node was bootstrapped. However, I am hoping we can get agreement to include it in public CAPI images so that we can avoid having to do that in the future.

At my company we are now just building our own custom VM templates, which include ecr-credential-provider.

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

I'd like to see it included as well

/remove-lifecycle stale

Ideally this should be possible with a custom_role, but if I understand correctly, there is no place where this binary exists pre-built?

There is an issue upstream about this which seems to actually suggest the binaries are available, just not documented.

The binaries can be found here: https://console.cloud.google.com/storage/browser/k8s-artifacts-prod/binaries/cloud-provider-aws;tab=objects?prefix=&forceOnObjectsSortingFiltering=false
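If the bucket can be read over plain HTTPS, fetching the binary at bootstrap or from a custom_role might look something like this. This is an untested sketch: the object layout below the bucket prefix (version, OS, architecture, file name) is an assumption and should be checked against the bucket listing, and both function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: the path layout under the bucket prefix is an assumption,
# and both function names are hypothetical.

# Build the download URL for a release tag ($1) and CPU architecture ($2).
ecr_provider_url() {
  version="$1"   # e.g. v1.27.1
  arch="$2"      # e.g. amd64
  printf '%s/%s/linux/%s/ecr-credential-provider-linux-%s\n' \
    'https://storage.googleapis.com/k8s-artifacts-prod/binaries/cloud-provider-aws' \
    "$version" "$arch" "$arch"
}

# Download the binary into a target directory (default /usr/local/bin)
# and mark it executable. The file is saved as ecr-credential-provider so
# that it matches the provider name in the CredentialProviderConfig.
install_ecr_provider() {
  version="$1"
  arch="$2"
  dest_dir="${3:-/usr/local/bin}"
  curl -fsSL -o "${dest_dir}/ecr-credential-provider" \
    "$(ecr_provider_url "$version" "$arch")" || return 1
  chmod 0755 "${dest_dir}/ecr-credential-provider"
}
```

For example, install_ecr_provider v1.27.1 amd64 would place the binary in /usr/local/bin, which is the directory suggested earlier in this issue.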

Would someone who is interested in this feature be willing to give it a try using custom_role and report back on how it went?