eksctl-io/eksctl

[Bug] EKSCTL 0.189.0 panic: runtime error: invalid memory address or nil pointer dereference

ttirtawi opened this issue · 2 comments

What were you trying to accomplish?

Using the standard YAML file, I always see the following warnings about IRSA deprecation:

2024-08-28 13:50:25 [!]  recommended policies were found for "vpc-cni" addon, but since OIDC is disabled on the cluster, eksctl cannot configure the requested permissions; the recommended way to provide IAM permissions for "vpc-cni" addon is via pod identity associations; after addon creation is completed, add all recommended policies to the config file, under `addon.PodIdentityAssociations`, and run `eksctl update addon`

2024-08-28 13:56:59 [!]  IRSA has been deprecated; the recommended way to provide IAM permissions for "aws-efs-csi-driver" addon is via pod identity associations; after addon creation is completed, run `eksctl utils migrate-to-pod-identity`


2024-08-28 13:58:31 [!]  IRSA has been deprecated; the recommended way to provide IAM permissions for "aws-ebs-csi-driver" addon is via pod identity associations; after addon creation is completed, run `eksctl utils migrate-to-pod-identity`

So I'm trying to create a new cluster with the addons and the pod identity agent at the same time, so that I don't need to run migrate-to-pod-identity after the cluster has been created. However, the process crashes with SIGSEGV when creating the vpc-cni addon.

What happened?

This is the error I got:

2024-08-28 20:55:34 [ℹ]  eksctl version 0.189.0
2024-08-28 20:55:34 [ℹ]  using region ap-southeast-2
2024-08-28 20:55:35 [ℹ]  setting availability zones to [ap-southeast-2a ap-southeast-2c ap-southeast-2b]
2024-08-28 20:55:35 [ℹ]  subnets for ap-southeast-2a - public:192.168.0.0/19 private:192.168.96.0/19
2024-08-28 20:55:35 [ℹ]  subnets for ap-southeast-2c - public:192.168.32.0/19 private:192.168.128.0/19
2024-08-28 20:55:35 [ℹ]  subnets for ap-southeast-2b - public:192.168.64.0/19 private:192.168.160.0/19
2024-08-28 20:55:35 [ℹ]  nodegroup "nodegroup1" will use "" [AmazonLinux2023/1.30]
2024-08-28 20:55:35 [ℹ]  using Kubernetes version 1.30
2024-08-28 20:55:35 [ℹ]  creating EKS cluster "tseldemo" in "ap-southeast-2" region with managed nodes
2024-08-28 20:55:35 [ℹ]  1 nodegroup (nodegroup1) was included (based on the include/exclude rules)
2024-08-28 20:55:35 [ℹ]  will create a CloudFormation stack for cluster itself and 0 nodegroup stack(s)
2024-08-28 20:55:35 [ℹ]  will create a CloudFormation stack for cluster itself and 1 managed nodegroup stack(s)
2024-08-28 20:55:35 [ℹ]  if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=ap-southeast-2 --cluster=tseldemo'
2024-08-28 20:55:35 [ℹ]  Kubernetes API endpoint access will use default of {publicAccess=true, privateAccess=false} for cluster "tseldemo" in "ap-southeast-2"
2024-08-28 20:55:35 [ℹ]  CloudWatch logging will not be enabled for cluster "tseldemo" in "ap-southeast-2"
2024-08-28 20:55:35 [ℹ]  you can enable it with 'eksctl utils update-cluster-logging --enable-types={SPECIFY-YOUR-LOG-TYPES-HERE (e.g. all)} --region=ap-southeast-2 --cluster=tseldemo'
2024-08-28 20:55:35 [ℹ]
2 sequential tasks: { create cluster control plane "tseldemo",
    2 sequential sub-tasks: {
        5 sequential sub-tasks: {
            1 task: { create addons },
            wait for control plane to become ready,
            associate IAM OIDC provider,
            no tasks,
            update VPC CNI to use IRSA if required,
        },
        create managed nodegroup "nodegroup1",
    }
}
2024-08-28 20:55:35 [ℹ]  building cluster stack "eksctl-tseldemo-cluster"
2024-08-28 20:55:37 [ℹ]  deploying stack "eksctl-tseldemo-cluster"
2024-08-28 20:56:07 [ℹ]  waiting for CloudFormation stack "eksctl-tseldemo-cluster"
2024-08-28 20:56:38 [ℹ]  waiting for CloudFormation stack "eksctl-tseldemo-cluster"
2024-08-28 20:57:38 [ℹ]  waiting for CloudFormation stack "eksctl-tseldemo-cluster"
2024-08-28 20:58:39 [ℹ]  waiting for CloudFormation stack "eksctl-tseldemo-cluster"
2024-08-28 20:59:39 [ℹ]  waiting for CloudFormation stack "eksctl-tseldemo-cluster"
2024-08-28 21:00:41 [ℹ]  waiting for CloudFormation stack "eksctl-tseldemo-cluster"
2024-08-28 21:01:41 [ℹ]  waiting for CloudFormation stack "eksctl-tseldemo-cluster"
2024-08-28 21:02:42 [ℹ]  waiting for CloudFormation stack "eksctl-tseldemo-cluster"
2024-08-28 21:03:43 [ℹ]  waiting for CloudFormation stack "eksctl-tseldemo-cluster"
2024-08-28 21:03:48 [ℹ]  creating addon
2024-08-28 21:03:48 [ℹ]  successfully created addon
2024-08-28 21:03:49 [ℹ]  "addonsConfig.autoApplyPodIdentityAssociations" is set to true; will lookup recommended pod identity configuration for "vpc-cni" addon
2024-08-28 21:03:49 [ℹ]  deploying stack "eksctl-tseldemo-addon-vpc-cni-podidentityrole-aws-node"
2024-08-28 21:03:49 [ℹ]  waiting for CloudFormation stack "eksctl-tseldemo-addon-vpc-cni-podidentityrole-aws-node"
2024-08-28 21:04:20 [ℹ]  waiting for CloudFormation stack "eksctl-tseldemo-addon-vpc-cni-podidentityrole-aws-node"
2024-08-28 21:04:20 [ℹ]  creating addon
2024-08-28 21:04:22 [ℹ]  successfully created addon
2024-08-28 21:04:23 [ℹ]  creating addon
2024-08-28 21:04:23 [ℹ]  successfully created addon
2024-08-28 21:04:24 [ℹ]  creating addon
2024-08-28 21:04:25 [ℹ]  successfully created addon
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x20 pc=0x105ad8d58]

goroutine 189 [running]:
github.com/weaveworks/eksctl/pkg/actions/addon.(*Manager).Update(0x14000578640, {0x10816d3a8, 0x10a8db2e0}, 0x1400033ed20, {0x0, 0x0}, 0x15d3ef79800)
	github.com/weaveworks/eksctl/pkg/actions/addon/update.go:121 +0xeb8
github.com/weaveworks/eksctl/pkg/actions/addon.CreateAddonTasks.func3()
	github.com/weaveworks/eksctl/pkg/actions/addon/tasks.go:111 +0x90
github.com/weaveworks/eksctl/pkg/utils/tasks.(*GenericTask).Do(0x14000807158, 0x0?)
	github.com/weaveworks/eksctl/pkg/utils/tasks/tasks.go:31 +0x34
github.com/weaveworks/eksctl/pkg/utils/tasks.doSingleTask(0x14000186960?, {0x10811d000, 0x14000807158})
	github.com/weaveworks/eksctl/pkg/utils/tasks/tasks.go:202 +0xc8
github.com/weaveworks/eksctl/pkg/utils/tasks.doSequentialTasks(0x0?, {0x1400050c880, 0x5, 0x0?})
	github.com/weaveworks/eksctl/pkg/utils/tasks/tasks.go:250 +0x6c
created by github.com/weaveworks/eksctl/pkg/utils/tasks.(*TaskTree).Do in goroutine 187
	github.com/weaveworks/eksctl/pkg/utils/tasks/tasks.go:158 +0x258

How to reproduce it?

Create the cluster using eksctl create cluster -f cluster-SYD.yaml, with the following YAML file:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: tseldemo
  region: ap-southeast-2
  version: latest
  tags:
    karpenter.sh/discovery: tseldemo

karpenter:
  version: '1.0.1'
  createServiceAccount: true

managedNodeGroups:
- name: nodegroup1
  instanceType: m6i.large
  privateNetworking: true
  desiredCapacity: 1
  iam:
    withAddonPolicies:
      albIngress: true
      autoScaler: true
      cloudWatch: true
      ebs: true
      efs: true
      fsx: true
      imageBuilder: true
      xRay: true
      awsLoadBalancerController: true
iam:
  withOIDC: true
  vpcResourceControllerPolicy: true

addons:
- name: vpc-cni
  version: latest
  useDefaultPodIdentityAssociations: true
- name: kube-proxy
  version: latest
  useDefaultPodIdentityAssociations: true
- name: coredns
  version: latest
  useDefaultPodIdentityAssociations: true
- name: aws-efs-csi-driver
  version: latest
  useDefaultPodIdentityAssociations: true
- name: eks-pod-identity-agent
  version: latest

Logs

Anything else we need to know?

I use eksctl on macOS 14.6.1, installed via Homebrew.

Versions

$ eksctl info
eksctl version: 0.189.0
kubectl version: v1.29.1
OS: darwin

cPu1 commented

@ttirtawi, we have identified the issue and will work on a fix soon. In the meantime, I'd recommend working around this by removing iam.withOIDC as you do not seem to be using IRSA.
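As a rough sketch of this workaround, the iam and addons sections of the reporter's config might look like the following once iam.withOIDC is dropped. Only those two sections are shown. This assumes metadata and managedNodeGroups stay as posted, and the thread does not confirm whether other sections (such as karpenter) still validate without OIDC.

```yaml
# Sketch only: the reporter's config with iam.withOIDC removed,
# as suggested in the workaround. Other top-level sections are omitted.
iam:
  # withOIDC: true   # removed for the workaround; IRSA is not used here
  vpcResourceControllerPolicy: true

addons:
- name: vpc-cni
  version: latest
  useDefaultPodIdentityAssociations: true
- name: kube-proxy
  version: latest
  useDefaultPodIdentityAssociations: true
- name: coredns
  version: latest
  useDefaultPodIdentityAssociations: true
- name: aws-efs-csi-driver
  version: latest
  useDefaultPodIdentityAssociations: true
# The pod identity agent addon stays in the list, as in the original config.
- name: eks-pod-identity-agent
  version: latest
```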

> @ttirtawi, we have identified the issue and will work on a fix soon. In the meantime, I'd recommend working around this by removing iam.withOIDC as you do not seem to be using IRSA.

I encountered the same crash and can confirm that this workaround works. 👍

However, I think there might be a related documentation issue on this page. The following sentence led me to assume that IRSA/OIDC was still required, so maybe it could be clarified:

Pod Identity Association leverages IRSA, however, it makes it configurable directly through EKS API, eliminating the need for using IAM API altogether.