Cannot get qMsgs if the WPA is deleted and re-created
KangBK0120 opened this issue · 3 comments
Hi, I faced an issue with the controller in the master branch.
If I create a dummy deployment and a dummy WPA, everything works without any problem.
The deployment does not do any work from SQS: it does not receive or process the messages in the queue. The dummy WPA simply auto-scales it.
However, the controller fails to get qMsgs if I delete the WPA and re-create it with the same YAML.
Here are the YAML files I used:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dummy-deployment
spec:
  selector:
    matchLabels:
      app: dummy-deployment
  replicas: 1
  template:
    metadata:
      labels:
        app: dummy-deployment
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80

apiVersion: k8s.practo.dev/v1
kind: WorkerPodAutoScaler
metadata:
  name: dummy-wpa
spec:
  minReplicas: 1
  maxReplicas: 100
  deploymentName: dummy-deployment
  queueURI: https://sqs.{region}.amazonaws.com/{id}/dummy-queue
  targetMessagesPerWorker: 1
  maxDisruption: "100%"
I did not change anything in the workerpodautoscaler deployment.
I found that the issue comes from poller.go.
When a new WPA resource is created, its thread is created successfully and works fine.
However, if a user deletes the WPA, the thread is stopped, but the Sync function in the poller still holds a status entry for it.
If a new WPA is then created with the same key, no new thread is created, so qMsgs cannot be fetched.
Therefore I changed the Sync function as follows, and this fixes my issue:
func (p *Poller) Sync(stopCh <-chan struct{}) {
	for {
		select {
		case listResultCh := <-p.listThreadCh:
			// Reply with a copy of the current thread map.
			listResultCh <- DeepCopyThread(p.threads)
		case threadStatus := <-p.updateThreadCh:
			for key, status := range threadStatus {
				if !status {
					// Drop stopped threads so that a WPA re-created
					// with the same key gets a new thread.
					delete(p.threads, key)
				} else {
					p.threads[key] = status
				}
			}
		case <-stopCh:
			klog.V(1).Info("Stopping sync thread of poller gracefully.")
			return
		}
	}
}
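To illustrate why removing the key matters, here is a minimal, self-contained sketch. It is not the controller's code: `applyUpdate`, `needsThread`, and the key `default/dummy-wpa` are hypothetical names, and the sketch assumes the poller starts a thread only for keys that are absent from the thread map, which the issue text does not confirm.

```go
package main

import "fmt"

// applyUpdate merges status updates into the thread map.
// With pruneStopped set, a false status removes the key (the change above);
// without it, the stale entry is kept (the behaviour described in the report).
func applyUpdate(threads, update map[string]bool, pruneStopped bool) {
	for key, running := range update {
		if !running && pruneStopped {
			delete(threads, key)
			continue
		}
		threads[key] = running
	}
}

// needsThread is a hypothetical stand-in for the poller's decision to start
// a thread. ASSUMPTION: a thread is started only for keys not in the map.
func needsThread(threads map[string]bool, key string) bool {
	_, tracked := threads[key]
	return !tracked
}

func main() {
	const key = "default/dummy-wpa" // hypothetical namespace/name key

	for _, prune := range []bool{false, true} {
		threads := map[string]bool{}
		applyUpdate(threads, map[string]bool{key: true}, prune)  // WPA created
		applyUpdate(threads, map[string]bool{key: false}, prune) // WPA deleted
		// Would a WPA re-created with the same key get a new thread?
		fmt.Printf("pruneStopped=%v restart=%v\n", prune, needsThread(threads, key))
	}
	// Prints:
	// pruneStopped=false restart=false
	// pruneStopped=true restart=true
}
```

Under that assumption, keeping a `false` entry in the map makes the re-created WPA look like it is already tracked, while deleting the entry lets it be picked up again.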
Thanks @KangBK0120 for reporting, debugging and fixing the issue.
I have published patch release v1.4.1 with the fix, since this could be critical for anyone who re-creates a WPA with the same name.
practodev/workerpodautoscaler:v1.4.1