active-monitor running workflows more frequently than the configuration
RaviHari opened this issue · 0 comments
Describe the bug
Active-Monitor runs workflows continuously when updating the Custom Resource (CR) fails, for example because of a storage error or because the API server is busy.
The timers are then never stopped, which leaks timers and creates a large number of workflow pods.
Expected behavior
If the CR update fails, the timers should be stopped and the request requeued.
Logs
2021-02-22T13:57:55.825Z ERROR controllers.HealthCheck Error updating healthcheck resource {"HealthCheck": "monitoring/dns-healthcheck", "error": "Operation cannot be fulfilled on healthchecks.activemonitor.keikoproj.io \"dns-healthcheck\": the object has been modified; please apply your changes to the latest version and try again"}
github.com/go-logr/zapr.(*zapLogger).Error
/go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128
github.com/keikoproj/active-monitor/controllers.(*HealthCheckReconciler).watchWorkflowReschedule
/workspace/controllers/healthcheck_controller.go:525
github.com/keikoproj/active-monitor/controllers.(*HealthCheckReconciler).createSubmitWorkflowHelper.func1
/workspace/controllers/healthcheck_controller.go:391
2021-02-22T14:58:59.848Z ERROR controllers.HealthCheck Error updating healthcheck resource {"HealthCheck": "monitoring/dns-healthcheck", "error": "Operation cannot be fulfilled on healthchecks.activemonitor.keikoproj.io \"dns-healthcheck\": StorageError: invalid object, Code: 4, Key: /registry/activemonitor.keikoproj.io/healthchecks/monitoring/dns-healthcheck, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: xxxx, UID in object meta: "}