openanalytics/shinyproxy-operator

Error in skipper ingress after upgrading to Kubernetes 1.20

Closed this issue · 3 comments

Hi Team,

I recently upgraded my EKS cluster to Kubernetes 1.20.
Since the upgrade, skipper-ingress has been giving errors. My prod, qa and dev environments run in the same cluster but in separate namespaces, and this setup ran fine while I was on Kubernetes 1.17. After the upgrade, I had to tear down DEV and PROD and recreate them. Now the ingress is not working for either of them, and I am getting the errors below in the logs of the skipper pod:
[APP]time="2021-08-06T20:21:07Z" level=error msg="convertPathRule: Failed to get service dev-app, sp-dev-app-shinyproxy-svc-48b8d81074f1af203a91795dfb7a74d9684d, 80"
[APP]time="2021-08-06T20:21:07Z" level=error msg="convertPathRule: Failed to get service prod-app, sp-prod-app-shinyproxy-svc-b0cdea4050e6ea521342f4a11149917441b, 80"
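To see at a glance which backends skipper cannot resolve, the error lines can be reduced to a unique list. This is only a rough sketch: it assumes the three comma-separated fields in the message are namespace, service name and port, and the names `PATTERN` and `missing_backends` are made up for this example.

```python
import re

# Matches the skipper error lines quoted above. Assumption: the three
# comma-separated fields are <namespace>, <service name>, <port>.
PATTERN = re.compile(
    r"convertPathRule: Failed to get service "
    r'(?P<namespace>[^,\s]+), (?P<service>[^,\s]+), (?P<port>[^"\s]+)'
)


def missing_backends(log_text):
    """Return a sorted list of unique (namespace, service, port) tuples."""
    found = set()
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            found.add((match["namespace"], match["service"], match["port"]))
    return sorted(found)
```

Feeding it the saved output of the skipper pod logs should give one entry per service that the ingress still references but skipper cannot find.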

Strangely, my QA environment is still up and running, and I can reach the app without any issues.
Do I need to change anything for this new version?
My shinyproxy-operator is running in the shinyproxy-operator namespace, while skipper is running in the kube-system namespace.
Any ideas?

Hi Team,

On further investigation, I found that the dev and prod services that the skipper ingress is looking for no longer exist, which is why I could not reach the applications in those environments.
To resolve this, I deleted the running shinyproxy-operator and created it again. After that, everything started working again.
Please find the chronology of events below:

  1. Upgraded the EKS cluster from 1.17 > 1.18 > 1.19 > 1.20 --> All the environments were working fine and the service was available in each namespace.
  2. Deleted the dev and prod applications and created them again using argocd --> Both environments stopped working, i.e. the pods were running fine but the dev and prod applications were not reachable. On further investigation, I found that after the dev and prod shinyproxy environments were created, the corresponding service was not created by the shinyproxy-operator, which is what causes this issue.
  3. To work around this, after every new creation of an application, I need to delete the shinyproxy-operator and then recreate it.
  4. I repeated the operation by deleting the dev instance of shinyproxy again. This time, with the shinyproxy-operator recreated after the upgrade, the service was created automatically, as it was before the upgrade.
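The missing services from step 2 can also be confirmed without reading the skipper logs, by comparing the services referenced by each ingress with the services that actually exist. Below is a minimal sketch of that check. It assumes the two inputs are the parsed JSON from `kubectl get ingress --all-namespaces -o json` and `kubectl get services --all-namespaces -o json`, and the function names are hypothetical. It handles both the `networking.k8s.io/v1` backend layout (`backend.service.name`) and the older v1beta1 layout (`backend.serviceName`).

```python
def backend_service_names(ingress):
    """Collect the service names referenced by one Ingress object (a dict)."""
    spec = ingress.get("spec", {})
    backends = []
    # Default backend: "defaultBackend" in v1, "backend" in v1beta1.
    default = spec.get("defaultBackend") or spec.get("backend")
    if default:
        backends.append(default)
    for rule in spec.get("rules", []):
        for path in (rule.get("http") or {}).get("paths", []):
            backends.append(path.get("backend") or {})
    names = set()
    for backend in backends:
        # v1: backend.service.name, v1beta1: backend.serviceName
        name = (backend.get("service") or {}).get("name") or backend.get("serviceName")
        if name:
            names.add(name)
    return names


def find_missing_services(ingress_list, service_list):
    """Return (namespace, ingress name, service name) for each backend
    whose service does not exist in the same namespace."""
    existing = {
        (svc["metadata"]["namespace"], svc["metadata"]["name"])
        for svc in service_list.get("items", [])
    }
    missing = []
    for ingress in ingress_list.get("items", []):
        namespace = ingress["metadata"]["namespace"]
        for name in sorted(backend_service_names(ingress)):
            if (namespace, name) not in existing:
                missing.append((namespace, ingress["metadata"]["name"], name))
    return missing
```

An empty result would mean every ingress backend has a matching service. Any entry returned points at a service the operator was expected to create but did not.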

So the issue is fixed for now, but it is still something to look at for improving the shinyproxy-operator.

Thank you

Hello,
Adding to the above: after the upgrade to 1.20, it takes a considerable amount of time to create the service, so it takes some time before I can actually start using the application, even though the pods start running after only 2 min.
I have also noticed that the above issue of the service not being created after deleting and recreating the application still persists in this version, and it can only be fixed by deleting and recreating the shinyproxy-operator again.
Please look into this.

Hi

I tested the operator on EKS v1.20 and everything works fine on our end. Can you try with the latest version of the operator (0.1.0-SNAPSHOT-20210729.111625)? We fixed a bug that could cause the service not to be created properly. If you still experience the issue with this release, please provide the logs of the operator.