Error in ReplicaSet "request did not complete within requested timeout" installing addon.
Closed this issue · 2 comments
/kind bug
What happened?
While upgrading from "v1.29.1-eksbuild.1" to "v1.30.0-eksbuild.1", the update timed out. I removed the add-on and then tried reinstalling it, and the add-on is now stuck on "Creating". Upon further investigation, it appears the ReplicaSet for the controller is timing out while creating the controller pods. The ReplicaSet shows this error in its events:
Error creating: Timeout: request did not complete within requested timeout - context deadline exceeded
Is there somewhere I can look for additional details on what is causing the timeout?
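One place that may be worth checking is the set of admission webhooks that intercept pod creation. This is an assumption, not something this report confirms: a slow or unreachable webhook is one known way for a pod create request to hit the API server's deadline. The sketch below filters the JSON produced by `kubectl get mutatingwebhookconfigurations -o json` (or the `validatingwebhookconfigurations` equivalent) for webhooks whose rules match CREATE on pods. The function name `pod_create_webhooks` and the sample data are hypothetical and only illustrate the shape of the input.

```python
def pod_create_webhooks(config_list):
    """List webhooks whose rules match CREATE on pods.

    `config_list` is the parsed JSON of a webhook configuration list.
    Returns tuples of (configuration name, webhook name, failurePolicy,
    timeoutSeconds).
    """
    hits = []
    for cfg in config_list.get("items", []):
        for hook in cfg.get("webhooks", []):
            for rule in hook.get("rules", []):
                ops = rule.get("operations", [])
                resources = rule.get("resources", [])
                matches_create = "CREATE" in ops or "*" in ops
                matches_pods = any(r in ("pods", "*", "*/*") for r in resources)
                if matches_create and matches_pods:
                    hits.append((
                        cfg["metadata"]["name"],
                        hook["name"],
                        hook.get("failurePolicy"),
                        hook.get("timeoutSeconds"),
                    ))
                    break  # one matching rule is enough for this webhook
    return hits


# Hypothetical sample input; real data would come from kubectl's JSON output.
sample = {
    "items": [
        {
            "metadata": {"name": "example-pod-mutator"},
            "webhooks": [
                {
                    "name": "pods.example.invalid",
                    "failurePolicy": "Fail",
                    "timeoutSeconds": 30,
                    "rules": [{"operations": ["CREATE"], "resources": ["pods"]}],
                }
            ],
        }
    ]
}

for row in pod_create_webhooks(sample):
    print(row)
```

Any webhook this reports with a `failurePolicy` of `Fail` would be a candidate to inspect further, for example by checking whether its backing service has ready endpoints.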
What you expected to happen?
Expected the EBS CSI Driver to install successfully.
How to reproduce it (as minimally and precisely as possible)?
I am not sure exactly; this seems to be exclusive to a single cluster.
Anything else we need to know?:
ReplicaSet Describe
Name:           ebs-csi-controller-854b999fdc
Namespace:      kube-system
Selector:       app=ebs-csi-controller,app.kubernetes.io/name=aws-ebs-csi-driver,pod-template-hash=854b999fdc
Labels:         app=ebs-csi-controller
                app.kubernetes.io/component=csi-driver
                app.kubernetes.io/managed-by=EKS
                app.kubernetes.io/name=aws-ebs-csi-driver
                app.kubernetes.io/version=1.30.0
                pod-template-hash=854b999fdc
Annotations:    deployment.kubernetes.io/desired-replicas: 2
                deployment.kubernetes.io/max-replicas: 3
                deployment.kubernetes.io/revision: 1
Controlled By:  Deployment/ebs-csi-controller
Replicas:       0 current / 2 desired
Pods Status:    0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:           app=ebs-csi-controller
                    app.kubernetes.io/component=csi-driver
                    app.kubernetes.io/managed-by=EKS
                    app.kubernetes.io/name=aws-ebs-csi-driver
                    app.kubernetes.io/version=1.30.0
                    pod-template-hash=854b999fdc
  Service Account:  ebs-csi-controller-sa
  Containers:
   ebs-plugin:
    Image:      602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/aws-ebs-csi-driver:v1.30.0
    Port:       9808/TCP
    Host Port:  0/TCP
    Args:
      controller
      --endpoint=$(CSI_ENDPOINT)
      --k8s-tag-cluster-id=guardian-prod-primary
      --batching=true
      --logging-format=text
      --user-agent-extra=eks
      --v=2
    Limits:
      memory:  256Mi
    Requests:
      cpu:      10m
      memory:   40Mi
    Liveness:   http-get http://:healthz/healthz delay=10s timeout=3s period=10s #success=1 #failure=5
    Readiness:  http-get http://:healthz/healthz delay=10s timeout=3s period=10s #success=1 #failure=5
    Environment:
      CSI_ENDPOINT:           unix:///var/lib/csi/sockets/pluginproxy/csi.sock
      CSI_NODE_NAME:          (v1:spec.nodeName)
      AWS_ACCESS_KEY_ID:      <set to the key 'key_id' in secret 'aws-secret'>      Optional: true
      AWS_SECRET_ACCESS_KEY:  <set to the key 'access_key' in secret 'aws-secret'>  Optional: true
      AWS_EC2_ENDPOINT:       <set to the key 'endpoint' of config map 'aws-meta'>  Optional: true
      AWS_REGION:             us-east-1
    Mounts:
      /var/lib/csi/sockets/pluginproxy/ from socket-dir (rw)
   csi-provisioner:
    Image:      602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/csi-provisioner:v4.0.1-eks-1-30-2
    Port:       <none>
    Host Port:  <none>
    Args:
      --timeout=60s
      --csi-address=$(ADDRESS)
      --v=2
      --feature-gates=Topology=true
      --extra-create-metadata
      --leader-election=true
      --default-fstype=ext4
      --kube-api-qps=20
      --kube-api-burst=100
      --worker-threads=100
    Limits:
      memory:  256Mi
    Requests:
      cpu:     10m
      memory:  40Mi
    Environment:
      ADDRESS:  /var/lib/csi/sockets/pluginproxy/csi.sock
    Mounts:
      /var/lib/csi/sockets/pluginproxy/ from socket-dir (rw)
   csi-attacher:
    Image:      602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/csi-attacher:v4.5.1-eks-1-30-2
    Port:       <none>
    Host Port:  <none>
    Args:
      --timeout=60s
      --csi-address=$(ADDRESS)
      --v=2
      --leader-election=true
      --kube-api-qps=20
      --kube-api-burst=100
      --worker-threads=100
    Limits:
      memory:  256Mi
    Requests:
      cpu:     10m
      memory:  40Mi
    Environment:
      ADDRESS:  /var/lib/csi/sockets/pluginproxy/csi.sock
    Mounts:
      /var/lib/csi/sockets/pluginproxy/ from socket-dir (rw)
   csi-snapshotter:
    Image:      602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/csi-snapshotter:v7.0.2-eks-1-30-2
    Port:       <none>
    Host Port:  <none>
    Args:
      --csi-address=$(ADDRESS)
      --leader-election=true
      --extra-create-metadata
      --kube-api-qps=20
      --kube-api-burst=100
      --worker-threads=100
    Limits:
      memory:  256Mi
    Requests:
      cpu:     10m
      memory:  40Mi
    Environment:
      ADDRESS:  /var/lib/csi/sockets/pluginproxy/csi.sock
    Mounts:
      /var/lib/csi/sockets/pluginproxy/ from socket-dir (rw)
   csi-resizer:
    Image:      602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/csi-resizer:v1.10.1-eks-1-30-2
    Port:       <none>
    Host Port:  <none>
    Args:
      --timeout=60s
      --csi-address=$(ADDRESS)
      --v=2
      --handle-volume-inuse-error=false
      --leader-election=true
      --kube-api-qps=20
      --kube-api-burst=100
      --workers=100
    Limits:
      memory:  256Mi
    Requests:
      cpu:     10m
      memory:  40Mi
    Environment:
      ADDRESS:  /var/lib/csi/sockets/pluginproxy/csi.sock
    Mounts:
      /var/lib/csi/sockets/pluginproxy/ from socket-dir (rw)
   liveness-probe:
    Image:      602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/livenessprobe:v2.12.0-eks-1-30-2
    Port:       <none>
    Host Port:  <none>
    Args:
      --csi-address=/csi/csi.sock
    Limits:
      memory:  256Mi
    Requests:
      cpu:        10m
      memory:     40Mi
    Environment:  <none>
    Mounts:
      /csi from socket-dir (rw)
  Volumes:
   socket-dir:
    Type:               EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:          <unset>
  Priority Class Name:  system-cluster-critical
Conditions:
  Type             Status  Reason
  ----             ------  ------
  ReplicaFailure   True    FailedCreate
Events:
  Type     Reason        Age                   From                   Message
  ----     ------        ----                  ----                   -------
  Warning  FailedCreate  4m24s (x18 over 22m)  replicaset-controller  Error creating: Timeout: request did not complete within requested timeout - context deadline exceeded
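To make the retry pattern in the event above easier to read, here is a minimal sketch that parses one row of the Events table into fields and converts the durations to seconds. The regular expression and the helper names `to_seconds` and `parse_event` are illustrative assumptions. They are not part of kubectl, and they only handle rows shaped like the one above, with a single-token "From" column.

```python
import re

# Columns: Type, Reason, Age, optional "(xN over WINDOW)", From, Message.
EVENT_RE = re.compile(
    r"^(?P<type>\S+)\s+(?P<reason>\S+)\s+(?P<age>\S+)"
    r"(?:\s+\(x(?P<count>\d+) over (?P<window>\S+)\))?"
    r"\s+(?P<source>\S+)\s+(?P<message>.*)$"
)

UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def to_seconds(duration):
    """Convert a duration such as '4m24s' or '22m' to seconds."""
    return sum(int(n) * UNIT_SECONDS[u]
               for n, u in re.findall(r"(\d+)([dhms])", duration))


def parse_event(line):
    """Parse one Events row; return a dict, or None if the row does not match."""
    match = EVENT_RE.match(line.strip())
    if match is None:
        return None
    event = match.groupdict()
    # A row without "(xN over ...)" represents a single occurrence.
    event["count"] = int(event["count"]) if event["count"] else 1
    event["age_seconds"] = to_seconds(event["age"])
    event["window_seconds"] = to_seconds(event["window"]) if event["window"] else None
    return event


line = ("Warning FailedCreate 4m24s (x18 over 22m) replicaset-controller "
        "Error creating: Timeout: request did not complete within requested "
        "timeout - context deadline exceeded")

event = parse_event(line)
print(event["reason"], event["count"], event["age_seconds"], event["window_seconds"])
```

For the row in this report, that gives 18 failed create attempts over a 1320-second window, with the most recent one 264 seconds before the describe was taken.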
Environment
- Kubernetes version (use kubectl version): 1.28
- Driver version: v1.30.0-eksbuild.1
@ConnorJC3: Closing this issue.
In response to this:
/close
dupe of #2046