[efs-provisioner] EKS - timeout on mount
mlaythe opened this issue · 6 comments
Hi,
I've been running efs-provisioner on my cluster for a few days and it suddenly stopped working. It started throwing this error:
Unable to mount volumes for pod "<pod_name>)": timeout expired waiting for volumes to attach or mount for pod "prod"/"<pod_name>". list of unmounted volumes=[efs-pvc]. list of unattached volumes=[efs-pvc default-token-wxwxc]
I'm currently sharing an EFS drive between two namespaces. Here are the PVs and PVCs:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: demo-pv
  annotations:
    pv.kubernetes.io/provisioned-by: "aws-efs"
spec:
  capacity:
    storage: 1Mi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Delete
  storageClassName: aws-efs
  mountOptions:
    - hard
    - nfsvers=4.1
  nfs:
    path: /efs-pvc-<claim_id>
    server: <fs_id>.efs.us-west-2.amazonaws.com
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-pv-claim
  namespace: demo
  annotations:
    volume.beta.kubernetes.io/storage-class: "aws-efs"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prod-pv
  annotations:
    pv.kubernetes.io/provisioned-by: "aws-efs"
spec:
  capacity:
    storage: 1Mi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Delete
  storageClassName: aws-efs
  mountOptions:
    - hard
    - nfsvers=4.1
  nfs:
    path: /efs-pvc-<claim_id>
    server: <fs_id>.efs.us-west-2.amazonaws.com
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prod-pv-claim
  namespace: prod
  annotations:
    volume.beta.kubernetes.io/storage-class: "aws-efs"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
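A quick sanity check that the claims above are actually bound to their volumes before digging into the mount itself (names and namespaces taken from the manifests above):

kubectl get pv demo-pv prod-pv
kubectl get pvc demo-pv-claim -n demo
kubectl get pvc prod-pv-claim -n prod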
I'm able to manually mount it on a separate instance and cd into the path I specified. I'm unsure how to check the kubelet logs on EKS. Any help would be greatly appreciated, thanks!
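For anyone reproducing this, a manual mount check of that kind typically looks something like the sketch below; <fs_id>, <claim_id>, and /mnt/efs are placeholders, and the NFS options mirror the PV's mountOptions:

# On an EC2 instance in a subnet that has an EFS mount target (requires the NFS client, e.g. nfs-utils/nfs-common)
sudo mkdir -p /mnt/efs
sudo mount -t nfs4 -o nfsvers=4.1,hard,timeo=600,retrans=2 <fs_id>.efs.us-west-2.amazonaws.com:/ /mnt/efs
ls /mnt/efs/efs-pvc-<claim_id>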
It seems the pod can't mount the EFS drive when it's scheduled onto certain nodes, while it works fine on others. What could cause this stuck state on a node? Getting the kubelet logs looks like it would require SSH access, which I didn't configure initially, and redeploying the node group is unfortunately out of the question.
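Without SSH to the nodes, the failed mount usually still surfaces as pod events, so something like the following can at least narrow down which nodes are affected (pod name and namespace are placeholders):

kubectl get pod <pod_name> -n prod -o wide        # shows which node the pod landed on
kubectl describe pod <pod_name> -n prod           # Events section lists FailedMount / timeout details
kubectl get events -n prod --sort-by=.lastTimestamp | grep -i mount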
Do your security groups (https://docs.aws.amazon.com/efs/latest/ug/accessing-fs-create-security-groups.html) allow NFS access?
Yeah, I added port 2049 access to the security group for my EFS drive. It used to work before, so all permissions were set up correctly prior to this.
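If it helps to re-verify, something along these lines shows which security groups are attached to the EFS mount targets and whether they open port 2049 (the IDs are placeholders):

aws efs describe-mount-targets --file-system-id <fs_id>
aws efs describe-mount-target-security-groups --mount-target-id <mount_target_id>
aws ec2 describe-security-groups --group-ids <sg_id> --query 'SecurityGroups[].IpPermissions[?ToPort==`2049`]'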
kubelet logs should be handled by journald: journalctl -u kubelet. There should be log messages about the Mount operation failing.
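For example, once shell access to a node is available, something like this narrows the output to the relevant entries:

journalctl -u kubelet --since "1 hour ago" --no-pager | grep -iE 'mount|nfs|efs'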
I'm unable to SSH into the worker nodes because I didn't set that up initially on EKS.
Sorry, it looks like my EFS file system ran out of burst credits, hence the weird behavior. Thanks for your help and time!
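For anyone else landing here: the remaining burst credits are exposed as the BurstCreditBalance metric in CloudWatch, which can be checked with something like this (file system ID and time window are placeholders):

aws cloudwatch get-metric-statistics --namespace AWS/EFS --metric-name BurstCreditBalance --dimensions Name=FileSystemId,Value=<fs_id> --start-time <start_time> --end-time <end_time> --period 3600 --statistics Minimum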