Handle ConfigMap created by workload container during application remove
i-chvets opened this issue · 1 comments
i-chvets commented
Description
The workload container seldon-core creates a ConfigMap to track its leadership:
kubectl -n <namespace> get configmap a33bd623.machinelearning.seldon.io -o=yaml
apiVersion: v1
kind: ConfigMap
metadata:
  annotations:
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"seldon-controller-manager-5ff5b59788-r9c5b_d5372a80-cf0b-49ce-88d6-4c33a1096f28","leaseDurationSeconds":15,"acquireTime":"2023-03-23T18:52:51Z","renewTime":"2023-03-23T18:54:15Z","leaderTransitions":1}'
  creationTimestamp: "2023-03-23T18:49:51Z"
  labels:
    app.juju.is/created-by: seldon-controller-manager
  name: a33bd623.machinelearning.seldon.io
  namespace: kf
  resourceVersion: "4439"
  uid: 7dfa723e-5286-435b-a633-84a25f340506
This ConfigMap has an expiration time of 45 seconds.
The problem was first detected while testing upgrades: deploying the stable charm and then upgrading to the updated one within 45 seconds failed, because the container could not acquire the lock on the ConfigMap above:
error retrieving resource lock kf/a33bd623.machinelearning.seldon.io
If the upgrade is performed outside the expiration window, it succeeds.
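To make the timing concrete, here is a minimal Python sketch that reads the leader annotation shown above and checks whether the lock would be considered expired at a given moment. It assumes the 45-second window is counted from `renewTime`, which this issue does not confirm; `parse_leader_record` and `lock_expired` are hypothetical helper names, not part of seldon-core or the charm.

```python
import json
from datetime import datetime, timedelta, timezone

LEADER_ANNOTATION = "control-plane.alpha.kubernetes.io/leader"

# Annotation value copied from the ConfigMap dump in this issue.
EXAMPLE_ANNOTATIONS = {
    LEADER_ANNOTATION: '{"holderIdentity":"seldon-controller-manager-5ff5b59788-r9c5b_d5372a80-cf0b-49ce-88d6-4c33a1096f28","leaseDurationSeconds":15,"acquireTime":"2023-03-23T18:52:51Z","renewTime":"2023-03-23T18:54:15Z","leaderTransitions":1}'
}


def parse_leader_record(annotations):
    """Decode the JSON leader record stored in the ConfigMap annotation."""
    return json.loads(annotations[LEADER_ANNOTATION])


def lock_expired(record, now, window_seconds=45):
    """Return True if `now` is at least `window_seconds` past the last renewal.

    Assumption: the window is measured from `renewTime`; 45 seconds is the
    expiration time reported in this issue.
    """
    renewed = datetime.strptime(
        record["renewTime"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return now >= renewed + timedelta(seconds=window_seconds)


if __name__ == "__main__":
    record = parse_leader_record(EXAMPLE_ANNOTATIONS)
    print(record["holderIdentity"])
    print(lock_expired(record, datetime.now(timezone.utc)))
```

Under that assumption, an upgrade attempted before `renewTime` plus the window would still find the old holder's lock in place.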
On application removal, this ConfigMap (a33bd623.machinelearning.seldon.io) is not removed. It should be removed.
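One possible shape for the cleanup, sketched in Python under the assumption that `kubectl` is available wherever the removal logic runs. The charm may well use a Kubernetes client library instead; `delete_command` and `remove_leader_configmap` are hypothetical names used only for illustration.

```python
import subprocess

# ConfigMap name taken from this issue.
LEADER_CONFIGMAP = "a33bd623.machinelearning.seldon.io"


def delete_command(namespace, name=LEADER_CONFIGMAP):
    """Build the kubectl invocation that deletes the leader-election ConfigMap."""
    return [
        "kubectl", "-n", namespace,
        "delete", "configmap", name,
        "--ignore-not-found",  # do not fail if the ConfigMap is already gone
    ]


def remove_leader_configmap(namespace, run=subprocess.run):
    """Delete the ConfigMap; `run` is injectable so the call can be tested."""
    return run(delete_command(namespace), check=True)


if __name__ == "__main__":
    # Only prints the command; call remove_leader_configmap("kf") to execute it.
    print(" ".join(delete_command("kf")))
```

Passing `--ignore-not-found` keeps the removal idempotent, so running it twice, or after the ConfigMap was cleaned up some other way, does not raise an error.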
i-chvets commented
Seldon-core hardcodes the name of that ConfigMap:
seldon-core$ grep a33bd623 * -rn
helm-charts/seldon-core-operator/values.yaml:88: leaderElectionID: a33bd623.machinelearning.seldon.io
helm-charts/seldon-core-operator/README.md:79:| manager.leaderElectionID | string | `"a33bd623.machinelearning.seldon.io"` | |
operator/bundle/manifests/seldon-operator.clusterserviceversion.yaml:450: value: a33bd623.machinelearning.seldon.io
operator/bundle-certified/manifests/seldon-operator-certified.clusterserviceversion.yaml:450: value: a33bd623.machinelearning.seldon.io
operator/config/manager/manager.yaml:47: value: "a33bd623.machinelearning.seldon.io"
operator/main.go:57: leaderElectionIDDefault = "a33bd623.machinelearning.seldon.io"