keikoproj/active-monitor

active-monitor runs workflows more frequently than configured

RaviHari opened this issue · 0 comments

Describe the bug
Active-Monitor runs workflows continuously when updating a Custom Resource fails, for example with a storage error or because the API server is busy.
In that case the timers are not stopped, which leaks timers and causes a number of workflow pods to be created.

Expected behavior
If the CR update fails, the timers should be stopped and the request requeued.

Logs

2021-02-22T13:57:55.825Z	ERROR	controllers.HealthCheck	Error updating healthcheck resource	{"HealthCheck": "monitoring/dns-healthcheck", "error": "Operation cannot be fulfilled on healthchecks.activemonitor.keikoproj.io \"dns-healthcheck\": the object has been modified; please apply your changes to the latest version and try again"}
github.com/go-logr/zapr.(*zapLogger).Error
	/go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128
github.com/keikoproj/active-monitor/controllers.(*HealthCheckReconciler).watchWorkflowReschedule
	/workspace/controllers/healthcheck_controller.go:525
github.com/keikoproj/active-monitor/controllers.(*HealthCheckReconciler).createSubmitWorkflowHelper.func1
	/workspace/controllers/healthcheck_controller.go:391

2021-02-22T14:58:59.848Z	ERROR	controllers.HealthCheck	Error updating healthcheck resource	{"HealthCheck": "monitoring/dns-healthcheck", "error": "Operation cannot be fulfilled on healthchecks.activemonitor.keikoproj.io \"dns-healthcheck\": StorageError: invalid object, Code: 4, Key: /registry/activemonitor.keikoproj.io/healthchecks/monitoring/dns-healthcheck, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: xxxx, UID in object meta: "}