`readyz` should reflect `InformerManager` and `FederatedInformerManager`'s cache sync status
gary-lgy opened this issue · 3 comments
InformerManager
and FederatedInformerManager
manage informers for the host and member clusters, respectively, based on FederatedTypeConfig
s. For each FederatedTypeConfig
, the managers start an informer for the corresponding type.
If the informer for a type cannot be started or its cache cannot be synced (be it for host cluster or member clusters), we should expose this information for visibility. The current mechanism for doing so is by including the information in the response returned by the controller manager's readyz
endpoint. Refer to existing controller's readyz implementation for how one can implement cache sync check for InformerManager
and FederaterdInformerManager
.
Hi @gary-lgy Can I have a chance to try this issue?
I found InformerManager
and FederatedInformerManager
are created in createControllerContext, which means the initialization logic is different from the existing controllers (created and started in core.go), so wondering if we can just directly add readyz checkers like:
healthCheckHandler.AddReadyzChecker("informer-manager", func(_ *http.Request) error {
if controllerCtx.InformerManager.HasSynced() {
return nil
}
return fmt.Errorf("informer-manager not ready")
})
healthCheckHandler.AddReadyzChecker("federated-informer-manager", func(_ *http.Request) error {
if controllerCtx.FederatedInformerManager.HasSynced() {
return nil
}
return fmt.Errorf("federated-informer-manager not ready")
})
Hi @z1cheng, please feel free to contribute :)
We could add the checkers in createControllerContext
as you suggested, however the informer managers' HasSynced
method only exposes whether the FederatedTypeConfig
and/or FederatedCluster
caches are synced. This is important, certainly, but we're also interested in the status of the informers managed by the informer managers. Sometimes the reason a resource cannot be successfully dispatched is because the informer for that type is not synced. We want to expose this information for better visibility. The desired readyz output may look something like this:
% curl "localhost:11257/readyz?verbose"
[+]controller-automigration ok
[+]controller-federate ok
...
[+]informer-deployment.apps ok
[+]informer-secrets ok
...
[+]federatedinformer-member1-secrets ok
[+]federatedinformer-member1-deployments.apps ok
[+]federatedinformer-member2-secrets ok
[+]federatedinformer-member2-deployments.apps ok
Would you like to implement this?
Sure, it's more interesting than I thought!
/assign