helm/charts

[stable/mongodb] MongoDB using existing PV fails to recover

nicolas-g opened this issue · 5 comments

Describe the bug
When you deploy a MongoDB replica set with a LoadBalancer service type for external access and populate the database through the LoadBalancer, the data survives deleting the release because it is stored on Kubernetes persistent volumes.

If you then redeploy the MongoDB replica set, new LoadBalancer services are provisioned with different IP addresses, and MongoDB fails to start because it appears to keep trying to connect to the previous LoadBalancer addresses.
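
To illustrate, the stale addresses can be confirmed from inside one of the data-bearing pods, since the replica set configuration persisted on the PV still lists the old LoadBalancer IPs. The pod name and shell binary (mongo vs mongosh, depending on the image version) are assumptions, and the sketch assumes the root password is exported as MONGODB_ROOT_PASSWORD on the machine running kubectl:

kubectl exec -it pinning-db-dev-use1-lge-0 -- \
  mongosh --quiet -u root -p "$MONGODB_ROOT_PASSWORD" \
  --eval 'rs.conf().members.map(m => m.host)'
# still prints the old addresses, e.g. [ "172.20.0.118:27017", "172.20.0.119:27017" ]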

Version of Helm and Kubernetes:

helm version

version.BuildInfo{Version:"v3.5.3", GitCommit:"041ce5a2c17a58be0fcd5f5e16fb3e7e95fea622", GitTreeState:"dirty", GoVersion:"go1.16.2"}

kubectl version

Client Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.4-dirty", GitCommit:"e87da0bd6e03ec3fea7933c4b5263d151aafd07c", GitTreeState:"dirty", BuildDate:"2021-03-15T14:40:45Z", GoVersion:"go1.16.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19+", GitVersion:"v1.19.10-gke.1000", GitCommit:"fb668c07d234a3f2c6b9f7a57e030715a6074115", GitTreeState:"clean", BuildDate:"2021-04-29T09:17:21Z", GoVersion:"go1.15.10b5", Compiler:"gc", Platform:"linux/amd64"}

Which chart:
mongodb

What happened:
The redeployed MongoDB replica set fails to start.

Old services (before deleting Mongo)

NAME                                       TYPE           CLUSTER-IP       EXTERNAL-IP    PORT(S)           AGE
pinning-db-dev-use1-lge-0-external         LoadBalancer   172.20.103.208   172.20.0.118   27017:30399/TCP   12m
pinning-db-dev-use1-lge-1-external         LoadBalancer   172.20.103.84    172.20.0.119   27017:31718/TCP   12m

New services (after redeploying Mongo)

pinning-db-dev-use1-lge-0-external         LoadBalancer   172.20.103.196   172.20.0.35    27017:31874/TCP   31m
pinning-db-dev-use1-lge-1-external         LoadBalancer   172.20.103.174   172.20.0.37    27017:30599/TCP   31m
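
A listing like the above can be obtained with a plain kubectl query; the label selector is an assumption based on how the chart normally labels its resources:

kubectl get svc -l app.kubernetes.io/name=mongodb
# or, more crudely:
kubectl get svc | grep -- '-external'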

Container logs, note the references to the old IPs (172.20.0.118 and 172.20.0.119):

{"t":{"$date":"2021-05-24T16:25:30.453+00:00"},"s":"I",  "c":"-",        "id":4333222, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"RSM received failed isMaster","attr":{"host":"172.20.0.118:27017","error":"NetworkInterfaceExceededTimeLimit: Couldn't get a connection within the time limit of 500ms","replicaSet":"rs0","isMasterReply":"{}"}}
{"t":{"$date":"2021-05-24T16:25:30.453+00:00"},"s":"I",  "c":"NETWORK",  "id":4712102, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Host failed in replica set","attr":{"replicaSet":"rs0","host":"172.20.0.118:27017","error":{"code":202,"codeName":"NetworkInterfaceExceededTimeLimit","errmsg":"Couldn't get a connection within the time limit of 500ms"},"action":{"dropConnections":false,"requestImmediateCheck":false,"outcome":{"host":"172.20.0.118:27017","success":false,"errorMessage":"NetworkInterfaceExceededTimeLimit: Couldn't get a connection within the time limit of 500ms"}}}}
{"t":{"$date":"2021-05-24T16:25:30.453+00:00"},"s":"I",  "c":"-",        "id":4333222, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"RSM received failed isMaster","attr":{"host":"172.20.0.119:27017","error":"NetworkInterfaceExceededTimeLimit: Couldn't get a connection within the time limit of 500ms","replicaSet":"rs0","isMasterReply":"{}"}}
{"t":{"$date":"2021-05-24T16:25:30.453+00:00"},"s":"I",  "c":"NETWORK",  "id":4712102, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Host failed in replica set","attr":{"replicaSet":"rs0","host":"172.20.0.119:27017","error":{"code":202,"codeName":"NetworkInterfaceExceededTimeLimit","errmsg":"Couldn't get a connection within the time limit of 500ms"},"action":{"dropConnections":false,"requestImmediateCheck":false,"outcome":{"host":"172.20.0.119:27017","success":false,"errorMessage":"NetworkInterfaceExceededTimeLimit: Couldn't get a connection within the time limit of 500ms"}}}}
{"t":{"$date":"2021-05-24T16:25:35.953+00:00"},"s":"I",  "c":"CONNPOOL", "id":22576,   "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Connecting","attr":{"hostAndPort":"172.20.0.119:27017"}}
{"t":{"$date":"2021-05-24T16:25:35.953+00:00"},"s":"I",  "c":"CONNPOOL", "id":22576,   "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Connecting","attr":{"hostAndPort":"172.20.0.118:27017"}}
{"t":{"$date":"2021-05-24T16:25:38.409+00:00"},"s":"I",  "c":"NETWORK",  "id":22943,   "ctx":"listener","msg":"Connection accepted","attr":{"remote":"127.0.0.1:56846","connectionId":386,"connectionCount":1}}

What you expected to happen:

A new MongoDB replica set deployed with the same configuration should automatically use the existing data from the persistent volumes and recover without failures.
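
As a manual workaround sketch (not something the chart does on its own), the replica set can be force-reconfigured from inside one of the pods so that the member hosts point at the new LoadBalancer addresses. The pod name and shell binary are assumptions, the IPs are the new external IPs from the listing above, and the root password is assumed to be exported as MONGODB_ROOT_PASSWORD locally:

kubectl exec -it pinning-db-dev-use1-lge-0 -- mongosh -u root -p "$MONGODB_ROOT_PASSWORD" --eval '
  var cfg = rs.conf();
  cfg.members[0].host = "172.20.0.35:27017";  // new external IP of pod 0
  cfg.members[1].host = "172.20.0.37:27017";  // new external IP of pod 1
  rs.reconfig(cfg, { force: true });          // force: no primary is reachable at this point
'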

How to reproduce it (as minimally and precisely as possible):

  • deploy MongoDB with the Helm values below
architecture: replicaset
externalAccess:
  autoDiscovery:
    enabled: true
  enabled: true
  service:
    annotations:
      cloud.google.com/load-balancer-type: Internal
    port: 27017
    type: LoadBalancer
  • populate MongoDB with data

  • uninstall the Helm mongo release

  • install mongodb again using the same values (a command sketch of these steps follows below)
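
A minimal command sketch of those steps, assuming the values above are saved as values.yaml; the chart reference and release name are placeholders:

helm install pinning-db bitnami/mongodb -f values.yaml
# ... populate some data through the LoadBalancer endpoints ...
helm uninstall pinning-db        # the StatefulSet PVCs (and their PVs) are kept
helm install pinning-db bitnami/mongodb -f values.yaml   # the replica set fails to recover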

stale commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

Same issue after upgrading Kubernetes version.

@nicolas-g Have you found a solution to update the replicaset IPs?

Unfortunately no ...

stale commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

This repo is deprecated; for any further questions on this topic, refer to https://github.com/bitnami/charts/tree/master/bitnami/mongodb.