argoproj/argo-cd

Unable to use OCI Registry with sub helm charts

jascsch opened this issue · 19 comments

Describe the bug
You cannot create a new app that uses a Git repo URL as its source when the chart in that repo depends on a sub Helm chart hosted in an OCI registry.
The Chart.yaml looks like this:

apiVersion: v2
name: les-service
type: application
version: 1.0.0

dependencies:
  - name: les-service
    version: "1.0.0"
    repository: oci://registry.app.corpintra.net/les

To Reproduce
See screenshot

Expected behavior
The sub Helm chart should work with the OCI registry the same way it did before with a chart repo URL (https://registry.app.corpintra.net/chartrepo/les).

Screenshots
image

Version
v2.6.1+3f143c9

Logs

Unable to create application: application spec for ass is invalid: InvalidSpecError: Unable to generate manifests in env: rpc error: code = Unknown desc = helm dependency build failed exit status 1: Error: could not download oci://registry.app.corpintra.net/les/les-service: failed to copy: httpReadSeeker: failed open: failed to do request: Get "https://s3-edc.emea.svc.corpintra.net/edcs3dhccaasprodb/docker/registry/v2/blobs/sha256/24/2431b3f41cb1c457ca3828492ab626f446096b6205e297694a913e8e534416c2/data?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=7cbc312f5c926f7fc1ad%2F20230113%2Feddhcbuckets%2Fs3%2Faws4_request&X-Amz-Date=20230113T153231Z&X-Amz-Expires=1200&X-Amz-SignedHeaders=host&X-Amz-Signature=e9c0e391e10a05307d4a27286ccec4c46553f5fba7fb4751a355fefbdec7c549": Forbidden

  • Do you have an AppProject which has registry.app.corpintra.net/les in its sourceRepos?
  • And/or have you configured a repository with the suitable credentials to access that helm repo?

It's not very intuitive, but hopefully will be made more clear with #12255
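
For reference, the first question above corresponds roughly to an AppProject like the one below. This is a minimal sketch, not taken from the reporter's cluster: the project name `default`, the `argocd` namespace, and the destination are assumptions, and only the registry and Git URLs come from this thread.

```yaml
# Sketch: AppProject that allows both the Git repo and the OCI registry as sources.
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: default          # assumption: the project used by the Application
  namespace: argocd      # assumption: namespace where Argo CD is installed
spec:
  sourceRepos:
    - https://git.i.mercedes-benz.com/LES/msag-clamav   # Git repo holding Chart.yaml
    - registry.app.corpintra.net/les                    # OCI registry, written as in the question above
  destinations:
    - server: https://kubernetes.default.svc
      namespace: dev
```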

@blakepettersson
I cannot add any appProject with registry.app.corpintra.net/les as a sourceRepo because this always ends up in the "Forbidden" error message.

I have configured a repository with the suitable credentials like this:
argocd repo add registry.app.corpintra.net/les --enable-oci --type helm --name registry --username 'robot$les+robo' --password xxx --project default

I can confirm that the repo connection status is successful:
image

Nevertheless I cannot create any appProject which has access to registry.app.corpintra.net/les because this still ends up with the forbidden message.

I have also tried to add repocreds like this but this also makes no difference:
argocd repocreds add registry.app.corpintra.net --enable-oci --type helm --username 'robot$les+robo' --password xxx

Please note that registry.app.corpintra.net/les is a public repo so there are no credentials required to pull any helm charts from this repo.
The helm pull command works perfectly fine from my WSL Ubuntu, so I do not understand why Argo CD cannot handle it.
helm pull oci://registry.app.corpintra.net/les/les-service --version 1.0.0 --untar

This is what my Application looks like:

spec:
  destination:
    namespace: dev
    server: https://kubernetes.default.svc
  project: default
  source:
    helm:
      valueFiles:
      - values-dev.yaml
    path: env
    repoURL: https://git.i.mercedes-benz.com/LES/msag-clamav
    targetRevision: HEAD

And I can confirm that this works if I am using a helm chart repo instead of an OCI registry.

apiVersion: v2
name: les-service
type: application
version: 1.0.0

dependencies:
  - name: les-service
    version: "1.0.0"
    repository: https://registry.app.corpintra.net/chartrepo/les

So this has to work the same way for an OCI registry such as oci://registry.app.corpintra.net/les, without specifying any repo creds or anything similar.
This is a very frustrating issue because we are forced to switch to OCI Helm charts, and the Helm chart repos will be deprecated soon.
Could you please consider fixing this issue or at least giving a feasible workaround?

And yes registry.app.corpintra.net/les is currently set as allowed source repo for the default project:
image

@jascsch I use OCI dependencies myself and have seen this work with my own eyes. Perhaps it's possible that you bumped into the same issue I did in #10218?

If you are talking about this one:

  1. Create a suitable secret
  2. Create a new Helm chart in a Git repository with a Helm dependency which makes use of said secret
  3. Sync, verify all is good
  4. Delete the secret
  5. Do a hard refresh
  6. The app will now error, and no amount of hard refreshes or recreations of the application will allow the application to sync its manifests, which is fine for now
  7. Recreate the same secret
  8. Try to do another hard refresh
  9. Although the secret is back, no amount of hard refreshes or recreations of the application will ever get it back to a good state

Sorry, but this is not a feasible workaround for me.
I have already spent hours trying to fix this, and it has to be a bug in my case. It's wrong that there is a Forbidden message, because it's a public repo and this has to work without repo creds.

I'm currently facing exactly the same problem as described by @jascsch. We're also forced to use OCI, but when I try to use it as a dependency, I always get a Forbidden message, although the registry is public and I can fetch the chart using helm pull.

So I also think that this is an issue in Argo CD. A fix would be much appreciated and is important for me, too.

I've been digging through all of the related issues that I can find on this, and I'm in the same position as others.

Our scenario:

  1. We have helm chart dependencies in Google Artifact Registry using OCI
  2. In kubernetes (GKE), we rely on workload identities to grant rights to individual pods / workloads
  3. In ArgoCD, we apply a helm chart from git which includes the OCI dependency chart
  4. helm dependency build fails with a 403 due to an anonymous token being passed

Expectation:

Since the pod(s) running ArgoCD applications have workload identities granted via IAM, we should not need to pass credentials explicitly. The fact that the pod has proper bindings to IAM means that the request should essentially be possible by using the access token method where the token is granted by the GCP metadata server upon request.

Reference:
https://cloud.google.com/artifact-registry/docs/helm/authentication

Related:
#12554
#12392

I have a workaround for this issue that applies to GCP / Google Artifact Registry when using workload identities:

  1. Build a new container with gcloud:
    Dockerfile
FROM google/cloud-sdk:latest as gcloudSDK

FROM argoproj/argocd:v2.6.0

COPY --from=gcloudSDK /usr/lib/google-cloud-sdk /usr/lib/google-cloud-sdk

# switch to root
USER 0

RUN \
  ln -s /usr/lib/google-cloud-sdk/bin/gcloud /usr/bin/gcloud && \
  ln -s /usr/lib/google-cloud-sdk/bin/docker-credential-gcloud /usr/bin/docker-credential-gcloud

RUN \
  apt-get update && apt-get install -f -y curl

# switch back to argocd
USER 999

  2. Modify argo-cd-repo-server (example given for helm)
    values.yaml
  repoServer:
    # Default image used by all components
    image:
      # -- If defined, a repository applied to all Argo CD deployments
      repository: us-docker.pkg.dev/myregisry/container/argocd
      # -- Overrides the global Argo CD image tag whose default is the chart appVersion
      tag: v2.6.0-gcloud
      # -- If defined, a imagePullPolicy applied to all Argo CD deployments
      imagePullPolicy: IfNotPresent

    serviceAccount:
      annotations:
        iam.gke.io/gcp-service-account: argo@myproject.iam.gserviceaccount.com

    securityContext:
      runAsUser: 999

    volumes:
      - name: gcloud
        emptyDir: {}

    volumeMounts:
      - mountPath: /gcloud
        name: gcloud

    env:
      - name: CLOUDSDK_CONFIG
        value: /gcloud/
      - name: DOCKER_CONFIG
        value: /helm-working-dir/.docker

    initContainers:
      - name: activate-helm-creds
        image: us-docker.pkg.dev/myregisry/container/argocd:v2.6.0-gcloud

        securityContext:
          runAsUser: 0

        volumeMounts:
          - mountPath: /helm-working-dir
            name: helm-working-dir
          - mountPath: /gcloud
            name: gcloud

        env:
          - name: CLOUDSDK_CONFIG
            value: /gcloud/
          - name: DOCKER_CONFIG
            value: /helm-working-dir/.docker

        command: ["/bin/bash", "-c"]
        args:
          - gcloud init;
            ( yes | gcloud auth configure-docker || test $? -eq 141 );
            ( yes | gcloud auth configure-docker us-docker.pkg.dev || test $? -eq 141 );
            chown -R 999:999 /gcloud;
            chown -R 999:999 /helm-working-dir;
            find /gcloud -type f -exec chmod +r {} \; ;
            find /gcloud -type d -exec chmod +rx {} \; ;
            find /helm-working-dir -type f -exec chmod +r {} \; ;
            find /helm-working-dir -type d -exec chmod +rx {} \; ;
  3. Set up workload identity bindings (example given for ConfigConnector)
apiVersion: iam.cnrm.cloud.google.com/v1beta1
kind: IAMPartialPolicy
metadata:
  name: sa-argo-cnrm
spec:
  resourceRef:
    apiVersion: iam.cnrm.cloud.google.com/v1beta1
    kind: IAMServiceAccount
    external: projects/myproject/serviceAccounts/argocd@myproject.iam.gserviceaccount.com
  bindings:
    - role: roles/iam.workloadIdentityUser
      members:
        - member: serviceAccount:myproject.svc.id.goog[argocd/argo-cd-repo-server]
  4. Disable helm.passCredentials for your application managing deployments
  5. Sync your failing application. If an error is cached, you may need to delete / recreate the offending app.

We have a workaround for this issue. Instead of using a Chart.yaml file we decided to specify multiple sources and reference the helm chart as a git repo like this:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: msag-clamav-dev
  namespace: argocd
spec:
  project: default
  sources:
    - repoURL: 'https://git.i.mercedes-benz.com/LES/msag-clamav'
      targetRevision: develop
      ref: myRepo
      helm:
        releaseName: msag-clamav
    - repoURL: 'https://git.i.mercedes-benz.com/LES/helm.git'
      path: charts/les-service
      targetRevision: main
      helm:
        releaseName: msag-clamav
        valueFiles:
          - $myRepo/env/values-dev.yaml
  destination:
    server: https://kubernetes.default.svc
    namespace: dev

Isn't that a different scenario, @jascsch? In your example you are just fetching the values.yaml from a different repository. However, if your Helm chart uses OCI dependencies in the Chart.yaml, the same error as described in this issue will occur, right?

That is actually a workaround because using OCI as dependencies in the Chart.yaml does not work.
The advantage of this solution is that you do not need to use a Chart.yaml in your repo and you do not need to maintain the helm chart in the OCI registry but only in the helm repo.

In the end it serves the same purpose to deploy apps via a central helm chart repo.

Hi, I am encountering an error very similar to the one described at the start of this thread and in #11717.

After upgrading from Argo CD v2.7.7 to v2.7.8 (or any later version; my target is the latest, v2.8.2), I see this error:

(v2.8.0+)

Failed to load target state: failed to generate manifest for source 1 of 1: rpc error: code = Unknown desc = Manifest generation error (cached): `helm dependency build` failed exit status 1: Error: could not download oci://docker.bcas.cz/cz/bcas/app/crm/crm/helm/crm: failed to authorize: failed to fetch anonymous token: unexpected status from GET request to https://gitlab.bcas.cz/jwt/auth?scope=repository%3Acz%2Fbcas%2Fapp%2Fcrm%2Fcrm%2Fhelm%2Fcrm%3Apull&service=container_registry: 403 Forbidden

(v2.7.8-2.7.X)

rpc error: code = Unknown desc = Manifest generation error (cached): `helm dependency build` failed exit status 1: Error: could not download oci://docker.bcas.cz/cz/bcas/app/broker-coach/broker-coach/helm/broker-coach: failed to authorize: failed to fetch anonymous token: unexpected status: 403 Forbidden

This happens in my Applications that depend on a Helm chart in an OCI repository. When I create an Application directly from the OCI repo, Argo CD downloads and builds the Helm chart correctly. But when the chart is a dependency of a parent chart, Argo CD cannot download it during the build and throws this error.

The repository seems to be OK.
image

The project has this repository assigned as a source repository.
image

This is what my Helm dependency definition looks like:
image

And this is what my repository secret for Argo CD looks like:

apiVersion: v1
kind: Secret
metadata:
  annotations:
  labels:
    argocd.argoproj.io/secret-type: repository
  name: argocd-repo-cz.bcas-helm
  namespace: argo-cd
type: Opaque
data:
  enableOCI: enable
  name: cz.bcas-helm
  password: XXXXXXXXXXXXXX
  project: applications
  type: helm
  url: docker.bcas.cz/cz/bcas
  username: XXXXXXXXXXXXXX
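
One thing that may be worth ruling out, offered as an assumption rather than a confirmed cause: values under `data` in a Kubernetes Secret must be base64-encoded, and the Argo CD declarative setup examples write the OCI flag as `enableOCI: "true"`. A sketch of the same secret using `stringData`, with placeholder credentials:

```yaml
# Sketch only: same repository secret, expressed with stringData.
apiVersion: v1
kind: Secret
metadata:
  name: argocd-repo-cz.bcas-helm
  namespace: argo-cd
  labels:
    argocd.argoproj.io/secret-type: repository
type: Opaque
stringData:                 # plain-text values; Kubernetes encodes them on write
  type: helm
  name: cz.bcas-helm
  url: docker.bcas.cz/cz/bcas
  enableOCI: "true"         # quoted boolean string, as in the Argo CD docs examples
  project: applications
  username: XXXXXXXXXXXXXX  # placeholder
  password: XXXXXXXXXXXXXX  # placeholder
```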

Is anyone still working on this problem? And can this information help with debugging and fixing it?

I just wanted to add my two cents since I saw this and was almost ready to give up hope.

I was able to get this working with v2.8.4+c279299, although it was very subtle and I don't have a minimal test case yet.

Just a checklist:

  1. Ensure passCredentials is true, in the Application manifest.
  2. Ensure that your Project has access to the source repository.
  3. Ensure that you have created an HTTP (OCI) Repository.
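
A sketch of how the first checklist item might look for an Application whose parent chart lives in Git and pulls an OCI dependency. The application name, Git URL, path, and `argocd` namespace are hypothetical placeholders and not from this thread; `MY_PROJECT` and `my-namespace` are the same placeholders used in the debugging example.

```yaml
# Sketch only: Git-sourced Application with passCredentials enabled.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app                      # hypothetical
  namespace: argocd                 # assumption: namespace where Argo CD is installed
spec:
  project: MY_PROJECT               # item 2: this project must allow the source repositories
  source:
    repoURL: https://git.example.com/org/my-app.git   # hypothetical Git repo with the parent Chart.yaml
    path: chart                                       # hypothetical path to the chart
    targetRevision: HEAD
    helm:
      passCredentials: true         # item 1
  destination:
    server: https://kubernetes.default.svc
    namespace: my-namespace
```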

To debug whether things were working or not, I actually just created an application pointing to the Library Chart.

project: MY_PROJECT
source:
  repoURL: quay.io/<PRIVATE>/<SUB>
  targetRevision: <VERSION>
  helm:
    passCredentials: true
  chart: <CHART>
destination:
  server: 'https://kubernetes.default.svc'
  namespace: my-namespace
syncPolicy:
  automated:
    prune: true
    selfHeal: true
    allowEmpty: true

Even though the repoURL pointed to a library chart, the errors that I got were MUCH more transparent than the random 401 from helm dependency build.

At the end of the day, if everything is working, the application should fail with
Error: library charts are not installable

At that point, after 4 hours my other application in the same project that referenced that library chart was able to run.

Something odd: at some point I saw error messages saying I couldn't access "quay.io/repository/<PRIVATE>/<SUB>", and I'm not sure where "repository" came from. It might be something specific to quay.io (or maybe it was the response to an HTTP redirect). In any case, I did add that URL to the general repository configuration and to the project-specific repository configuration.

VERY IMPORTANT: Something else very subtle that I noticed when trying to reproduce this: you need to do a Hard Refresh, as mentioned in #12039; otherwise your changes will not take effect.

I'm facing the same issue as originally described by @jascsch. We're using Google Artifact Registry, which only supports OCI. The application points to a git repository with a helm chart. This helm chart references a dependency with the prefix oci://.

I've connected the helm repo as helm OCI registry in Argo CD. I tried both the base registry and the full path. Both registries show as successfully connected in the UI. When Argo CD tries to pull the dependency chart, it throws a 403 Forbidden error.

Is there any simple workaround without switching to an application manifest?

In my opinion, this is quite a severe bug.

We are also still facing this issue, but we are now forced to upgrade Argo CD from 2.7.7 to the latest version.

I will focus more on this issue in a few weeks and will try to fix it and open a PR here.

IMHO this is a bug, and I'm shocked that almost no one is facing it.

After hours I was finally able to resolve it.

This issue helped me find the right config: #10823

The key is to not configure the full path of the chart as the repo URL. Any base path should work. So instead of europe-west3-docker.pkg.dev/my-project/my-docker-repo/my-chart, set it to europe-west3-docker.pkg.dev/my-project/my-docker-repo.

Second, it's hard to get the authentication right. In my case, setting the username to _json_key and the password to the contents of the service account key file solved it. Before that, I had always tried to use the private key from the key file as the TLS client certificate key.

It turns out that Argo CD always shows a Helm OCI registry as successfully connected, even when the authentication fails, which is super confusing.

I'm not 100% sure about this but it seems that you also need to set the name equal to the repo URL for it to work.
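
Put together, the configuration described above might look like the following repository secret. This is a sketch under assumptions: the secret name and the `argocd` namespace are hypothetical, the password is a placeholder for the key file contents, and setting `name` equal to the URL reflects the unconfirmed observation above.

```yaml
# Sketch only: Helm OCI repository secret for Google Artifact Registry.
apiVersion: v1
kind: Secret
metadata:
  name: gar-helm-oci                 # hypothetical
  namespace: argocd                  # assumption: namespace where Argo CD is installed
  labels:
    argocd.argoproj.io/secret-type: repository
type: Opaque
stringData:
  type: helm
  enableOCI: "true"
  url: europe-west3-docker.pkg.dev/my-project/my-docker-repo    # base path, not the full chart path
  name: europe-west3-docker.pkg.dev/my-project/my-docker-repo   # same as url (unconfirmed requirement)
  username: _json_key
  password: |
    <contents of the service account key JSON file>
```

After changing the secret, a hard refresh of the affected application may be needed before the new credentials are picked up.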