Failed on deploying Yatai to EKS
bobmayuze opened this issue · 5 comments
I followed the documentation and encountered several issues:
1. Failed to deploy PostgreSQL
$ k events pods/postgresql-ha-postgresql-0 -n yatai-system
LAST SEEN TYPE REASON OBJECT MESSAGE
35m (x208 over 69m) Warning Unhealthy Pod/yatai-6899664d9c-l2f6l Readiness probe failed: Get "http://172.31.89.159:7777/": dial tcp 172.31.89.159:7777: connect: connection refused
10m (x295 over 69m) Warning Unhealthy Pod/yatai-6899664d9c-l2f6l Liveness probe failed: Get "http://172.31.89.159:7777/": dial tcp 172.31.89.159:7777: connect: connection refused
9m8s (x11 over 110m) Warning FailedScheduling Pod/postgresql-ha-postgresql-1 running PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition
9m8s (x11 over 110m) Warning FailedScheduling Pod/postgresql-ha-postgresql-2 running PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition
5m3s (x184 over 60m) Warning BackOff Pod/yatai-6899664d9c-l2f6l Back-off restarting failed container
3m51s (x9 over 84m) Warning FailedScheduling Pod/postgresql-ha-postgresql-0 running PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition
26s (x482 over 120m) Normal ExternalProvisioning PersistentVolumeClaim/data-postgresql-ha-postgresql-0 waiting for a volume to be created, either by external provisioner "ebs.csi.aws.com" or manually created by system administrator
26s (x482 over 120m) Normal ExternalProvisioning PersistentVolumeClaim/data-postgresql-ha-postgresql-1 waiting for a volume to be created, either by external provisioner "ebs.csi.aws.com" or manually created by system administrator
26s (x483 over 120m) Normal ExternalProvisioning PersistentVolumeClaim/data-postgresql-ha-postgresql-2 waiting for a volume to be created, either by external provisioner "ebs.csi.aws.com" or manually created by system administrator
Then I switched to AWS RDS to keep the process going, yet the final stage of bringing up Yatai still failed.
2. Failed to bring up Yatai
$ k events pods/yatai-6899664d9c-l2f6l -n yatai-system
LAST SEEN TYPE REASON OBJECT MESSAGE
36m (x208 over 71m) Warning Unhealthy Pod/yatai-6899664d9c-l2f6l Readiness probe failed: Get "http://172.31.89.159:7777/": dial tcp 172.31.89.159:7777: connect: connection refused
11m (x295 over 71m) Warning Unhealthy Pod/yatai-6899664d9c-l2f6l Liveness probe failed: Get "http://172.31.89.159:7777/": dial tcp 172.31.89.159:7777: connect: connection refused
5m22s (x9 over 86m) Warning FailedScheduling Pod/postgresql-ha-postgresql-0 running PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition
117s (x482 over 121m) Normal ExternalProvisioning PersistentVolumeClaim/data-postgresql-ha-postgresql-0 waiting for a volume to be created, either by external provisioner "ebs.csi.aws.com" or manually created by system administrator
117s (x482 over 121m) Normal ExternalProvisioning PersistentVolumeClaim/data-postgresql-ha-postgresql-1 waiting for a volume to be created, either by external provisioner "ebs.csi.aws.com" or manually created by system administrator
117s (x483 over 121m) Normal ExternalProvisioning PersistentVolumeClaim/data-postgresql-ha-postgresql-2 waiting for a volume to be created, either by external provisioner "ebs.csi.aws.com" or manually created by system administrator
86s (x196 over 62m) Warning BackOff Pod/yatai-6899664d9c-l2f6l Back-off restarting failed container
28s (x12 over 111m) Warning FailedScheduling Pod/postgresql-ha-postgresql-1 running PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition
28s (x12 over 111m) Warning FailedScheduling Pod/postgresql-ha-postgresql-2 running PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition
Any help on how I can keep this going? Thanks!
First, you should check the PVC status:
kubectl -n yatai-system get pvc
If the PVCs are Pending or Failed, describe them to get the reason:
kubectl -n yatai-system describe pvc $pvcName
You may not have any StorageClass in your cluster, or your StorageClass may not have a working provisioner.
refs: https://kubernetes.io/docs/concepts/storage/dynamic-provisioning/
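A quick way to check this (assuming a standard EKS setup) is to list the storage classes and registered CSI drivers, and to verify that the EBS CSI driver pods are actually running:
# Show storage classes, their provisioners, and which one is the default
kubectl get storageclass
# Show CSI drivers registered in the cluster; ebs.csi.aws.com should appear here
kubectl get csidrivers
# The EBS CSI controller/node pods normally run in kube-system
kubectl -n kube-system get pods | grep ebs-csi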
I listed the PVCs and found this:
kubectl -n yatai-system get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
data-postgresql-ha-postgresql-0 Pending gp2 13h
data-postgresql-ha-postgresql-1 Pending gp2 13h
data-postgresql-ha-postgresql-2 Pending gp2 13h
To get a more detailed description, I ran k -n yatai-system describe pvc data-postgresql-ha-postgresql-0 and got this:
Name: data-postgresql-ha-postgresql-0
Namespace: yatai-system
StorageClass: gp2
Status: Pending
Volume:
Labels: app.kubernetes.io/component=postgresql
app.kubernetes.io/instance=postgresql-ha
app.kubernetes.io/name=postgresql-ha
Annotations: volume.beta.kubernetes.io/storage-provisioner: ebs.csi.aws.com
volume.kubernetes.io/selected-node: ip-172-31-21-121.ec2.internal
volume.kubernetes.io/storage-provisioner: ebs.csi.aws.com
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Used By: postgresql-ha-postgresql-0
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ExternalProvisioning 3m25s (x3203 over 13h) persistentvolume-controller waiting for a volume to be created, either by external provisioner "ebs.csi.aws.com" or manually created by system administrator
So I described the storage class gp2 and got this:
$ k describe storageclass gp2
Name: gp2
IsDefaultClass: Yes
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"},"name":"gp2"},"parameters":{"fsType":"ext4","type":"gp2"},"provisioner":"kubernetes.io/aws-ebs","volumeBindingMode":"WaitForFirstConsumer"}
,storageclass.kubernetes.io/is-default-class=true
Provisioner: kubernetes.io/aws-ebs
Parameters: fsType=ext4,type=gp2
AllowVolumeExpansion: <unset>
MountOptions: <none>
ReclaimPolicy: Delete
VolumeBindingMode: WaitForFirstConsumer
Events: <none>
Any guidance on getting this storage class to actually provision volumes? This part was not mentioned in the installation doc.
You should follow the official AWS documentation to set up the EBS CSI driver on EKS and enable the OIDC IAM provider on your existing EKS cluster:
https://docs.aws.amazon.com/eks/latest/userguide/ebs-csi.html
https://stackoverflow.com/a/68725742
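For reference, a rough sketch of those steps with eksctl (cluster name and account id below are placeholders; check the linked AWS doc for the authoritative steps):
# 1. Associate an IAM OIDC provider with the existing cluster
eksctl utils associate-iam-oidc-provider --cluster <cluster-name> --approve

# 2. Create an IAM role for the EBS CSI controller service account
eksctl create iamserviceaccount \
  --name ebs-csi-controller-sa \
  --namespace kube-system \
  --cluster <cluster-name> \
  --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy \
  --approve \
  --role-only \
  --role-name AmazonEKS_EBS_CSI_DriverRole

# 3. Install the EBS CSI driver as an EKS add-on, bound to that role
eksctl create addon \
  --name aws-ebs-csi-driver \
  --cluster <cluster-name> \
  --service-account-role-arn arn:aws:iam::<account-id>:role/AmazonEKS_EBS_CSI_DriverRole \
  --force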
It worked, but I had to re-install Yatai after the EBS CSI driver was configured.
@bobmayuze It is not necessary to reinstall; you only need to recreate the PVCs so they can be picked up by the new volume provisioner.
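Something like this should do it (a sketch; the Pending PVCs hold no data yet, and the pod label comes from the PVC labels shown above):
# Delete the Pending PVCs; they stay in Terminating until their pods are gone
# because of the kubernetes.io/pvc-protection finalizer
kubectl -n yatai-system delete pvc --wait=false \
  data-postgresql-ha-postgresql-0 data-postgresql-ha-postgresql-1 data-postgresql-ha-postgresql-2
# Delete the postgresql-ha pods so the StatefulSet recreates them along with fresh PVCs,
# which the EBS CSI driver can then provision
kubectl -n yatai-system delete pod -l app.kubernetes.io/name=postgresql-ha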