canonical/seldon-core-operator

Handle ConfigMap created by workload container during application remove

i-chvets opened this issue · 1 comments

Description

The seldon-core workload container creates a ConfigMap to track its leader election state:

kubectl -n <namespace> get configmap a33bd623.machinelearning.seldon.io -o=yaml
apiVersion: v1
kind: ConfigMap
metadata:
  annotations:
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"seldon-controller-manager-5ff5b59788-r9c5b_d5372a80-cf0b-49ce-88d6-4c33a1096f28","leaseDurationSeconds":15,"acquireTime":"2023-03-23T18:52:51Z","renewTime":"2023-03-23T18:54:15Z","leaderTransitions":1}'
  creationTimestamp: "2023-03-23T18:49:51Z"
  labels:
    app.juju.is/created-by: seldon-controller-manager
  name: a33bd623.machinelearning.seldon.io
  namespace: kf
  resourceVersion: "4439"
  uid: 7dfa723e-5286-435b-a633-84a25f340506

This ConfigMap has an expiration time of 45 seconds.

The problem was first detected when testing upgrades: deploying the stable charm and then upgrading to the updated one within 45 seconds fails, because the new container cannot acquire the lock on the ConfigMap above:

 error retrieving resource lock kf/a33bd623.machinelearning.seldon.io

If the upgrade is performed outside the expiration window, it succeeds.
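As a rough illustration of the expiration window, the sketch below decodes the leader annotation shown above and estimates how long a caller would have to wait before the lock is considered stale. The function names (parse_leader_record, seconds_until_free) are hypothetical and not part of seldon-core or the charm. The window is passed in as a parameter because the issue text mentions 45 seconds while the record itself reports leaseDurationSeconds of 15. The timestamp format is assumed to match the sample output.

```python
import json
from datetime import datetime, timedelta, timezone

LEADER_ANNOTATION = "control-plane.alpha.kubernetes.io/leader"


def parse_leader_record(annotations: dict) -> dict:
    """Decode the JSON leader record stored in the ConfigMap annotation."""
    return json.loads(annotations[LEADER_ANNOTATION])


def _parse_time(value: str) -> datetime:
    # Assumes second-resolution UTC timestamps such as "2023-03-23T18:54:15Z".
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def seconds_until_free(record: dict, now: datetime, window_seconds: float) -> float:
    """Seconds left until window_seconds have passed since the last renewal."""
    expires_at = _parse_time(record["renewTime"]) + timedelta(seconds=window_seconds)
    return max(0.0, (expires_at - now).total_seconds())


# Annotation value copied from the kubectl output in this issue.
annotations = {
    LEADER_ANNOTATION: (
        '{"holderIdentity":"seldon-controller-manager-5ff5b59788-r9c5b_'
        'd5372a80-cf0b-49ce-88d6-4c33a1096f28","leaseDurationSeconds":15,'
        '"acquireTime":"2023-03-23T18:52:51Z","renewTime":"2023-03-23T18:54:15Z",'
        '"leaderTransitions":1}'
    )
}

record = parse_leader_record(annotations)
# Hypothetical "current time", 15 seconds after the last renewal.
now = datetime(2023, 3, 23, 18, 54, 30, tzinfo=timezone.utc)
print(seconds_until_free(record, now, window_seconds=45))  # prints 30.0
```

A caller could sleep for the returned number of seconds before retrying the upgrade, although removing the stale ConfigMap is the cleaner fix.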

On application removal, this ConfigMap (a33bd623.machinelearning.seldon.io) is left behind. It should be removed together with the application.
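One possible shape for the cleanup is sketched below. It is not the charm's actual implementation: the helper names (delete_command, delete_leader_configmap) are hypothetical, and it shells out to kubectl only to stay free of third-party dependencies. A real charm would more likely call a Kubernetes client library from its remove handler. The --ignore-not-found flag makes the deletion a no-op when the ConfigMap is already gone.

```python
import subprocess

# Name taken from the issue; seldon-core uses it as the leader election ID.
LEADER_CONFIGMAP = "a33bd623.machinelearning.seldon.io"


def delete_command(namespace: str, name: str = LEADER_CONFIGMAP) -> list:
    """Build the kubectl invocation that deletes the leader election ConfigMap."""
    return [
        "kubectl", "-n", namespace,
        "delete", "configmap", name,
        "--ignore-not-found",  # do not fail if the ConfigMap is already gone
    ]


def delete_leader_configmap(namespace: str, run=subprocess.run) -> None:
    """Delete the ConfigMap; `run` is injectable so the call can be tested."""
    run(delete_command(namespace), check=True)


if __name__ == "__main__":
    # Show the command that would be executed for the "kf" namespace.
    print(" ".join(delete_command("kf")))
```

Calling delete_leader_configmap("kf") from the application's removal path would leave no stale lock for a later deployment to wait on.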

Seldon-core hardcodes the name of that ConfigMap:

seldon-core$ grep a33bd623  * -rn
helm-charts/seldon-core-operator/values.yaml:88:  leaderElectionID: a33bd623.machinelearning.seldon.io
helm-charts/seldon-core-operator/README.md:79:| manager.leaderElectionID | string | `"a33bd623.machinelearning.seldon.io"` |  |
operator/bundle/manifests/seldon-operator.clusterserviceversion.yaml:450:                  value: a33bd623.machinelearning.seldon.io
operator/bundle-certified/manifests/seldon-operator-certified.clusterserviceversion.yaml:450:                  value: a33bd623.machinelearning.seldon.io
operator/config/manager/manager.yaml:47:          value: "a33bd623.machinelearning.seldon.io"
operator/main.go:57:	leaderElectionIDDefault = "a33bd623.machinelearning.seldon.io"