ManageIQ/kubeclient

Informer.watch is not guaranteed to receive all changes during restart

DocX opened this issue · 0 comments

DocX commented

Using the Informer.watch method, which yields notices from the watch endpoint, does not guarantee that all events are received when the informer restarts the watcher:

  1. The informer fills/replaces its cache from a get request and stores the resource version.
  2. The informer starts a watch request from the resource version returned by the get.
  3. The watch request stops (timeout expires, connection drops, ...).
  4. Some changes happen in the Kubernetes cluster before the next get request (these are not yielded to the watch block).
  5. Repeat 1: the get receives the state including the changes from step 4 and updates the list.
  6. Repeat 2: the watch now starts after those changes, so the changes from step 4 are never yielded to the watch block.
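
The sequence above can be reproduced with a small in-memory simulation. This is only a sketch: `FakeCluster` and its `apply`, `list`, and `watch` methods are hypothetical stand-ins for the API server, not part of kubeclient, and the "watch" here simply returns the events newer than a given resource version.

```ruby
# Hypothetical in-memory stand-in for the API server (not kubeclient code).
class FakeCluster
  Event = Struct.new(:name, :resource_version)

  def initialize
    @version = 0
    @names = []
    @log = []
  end

  # Record a change under a new resource version.
  def apply(name)
    @version += 1
    @names << name
    @log << Event.new(name, @version)
  end

  # "get": current object names plus the resource version of the snapshot.
  def list
    [@names.dup, @version]
  end

  # "watch": events strictly newer than the given resource version.
  def watch(resource_version)
    @log.select { |event| event.resource_version > resource_version }
  end
end

cluster = FakeCluster.new
cluster.apply("pod-a")
yielded = []

# Steps 1-2: get, then watch from the resource version of that get.
_cache, version = cluster.list
cluster.apply("pod-b") # change that arrives while the watch is open
yielded.concat(cluster.watch(version).map(&:name))

# Step 3: the watch stops. Step 4: a change happens before the next get.
cluster.apply("pod-c")

# Steps 5-6: get again, then watch from the new resource version.
cache, version = cluster.list
yielded.concat(cluster.watch(version).map(&:name))

puts "cache:   #{cache.inspect}"
puts "yielded: #{yielded.inspect}"
```

In this sketch `pod-c` ends up in `cache` but never appears in `yielded`, which is the gap described in steps 4 to 6.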

Potential solutions:

  1. Document that the watch interface can miss changes (and that the list is the only source of truth).
  2. Issue the get request only once, when informer.start_worker is called; when the watch request stops, restart only the watch, from the latest seen resource version.
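
A rough sketch of the second option, under stated assumptions: `log`, `watch`, and `resume` are hypothetical names for an in-memory event history and a watch loop, not kubeclient's implementation. The point it illustrates is that tracking the resource version of every yielded event lets a restarted watch pick up where the previous one stopped.

```ruby
# Hypothetical event history and watch endpoint (not kubeclient code).
Event = Struct.new(:name, :resource_version)
log = []
watch = ->(since) { log.select { |event| event.resource_version > since } }

last_seen = 0 # resource version stored from the single initial get
yielded = []

# One watch session: yield each event and remember its resource version.
resume = lambda do
  watch.call(last_seen).each do |event|
    yielded << event.name
    last_seen = event.resource_version
  end
end

log << Event.new("pod-b", 1)
resume.call                  # first watch session, then the watch stops
log << Event.new("pod-c", 2) # change while no watch is open
resume.call                  # restarted watch resumes from last_seen

puts yielded.inspect
```

Here the change made while no watch was open is still yielded after the restart. A real implementation would also have to handle the API server rejecting a resource version that is too old (HTTP 410 Gone), in which case a fresh get is unavoidable.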