kubedl-io/kubedl

[feature request] automatically detect infrastructure anomalies and avoid scheduling pods on abnormal nodes.


What would you like to be added:

  1. collect anomalous pod states and events, and use them to progressively identify abnormal nodes.
  2. avoid scheduling pods on nodes that have been identified as abnormal.
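
To make the two points above concrete, here is a minimal sketch of one possible approach, not an existing KubeDL API. It assumes anomalies are simply counted per node and that a node is treated as abnormal once it reaches a fixed threshold. The names `PodAnomaly`, `Detector`, `Observe`, `AbnormalNodes`, and `SchedulableNodes` are hypothetical and only serve the illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// PodAnomaly is a hypothetical record of one anomalous pod state or event
// observed on a node (for example a failed volume mount).
type PodAnomaly struct {
	Node   string // name of the node the pod was placed on
	Reason string // free-form reason, e.g. "FailedMount"
}

// Detector counts anomalies per node and treats a node as abnormal
// once its count reaches Threshold (an assumed, simplistic policy).
type Detector struct {
	Threshold int
	counts    map[string]int
}

// NewDetector returns a Detector with an empty per-node counter.
func NewDetector(threshold int) *Detector {
	return &Detector{Threshold: threshold, counts: make(map[string]int)}
}

// Observe records one anomaly, so evidence against a node builds up progressively.
func (d *Detector) Observe(a PodAnomaly) {
	d.counts[a.Node]++
}

// AbnormalNodes returns the sorted names of nodes at or above the threshold.
func (d *Detector) AbnormalNodes() []string {
	var nodes []string
	for node, n := range d.counts {
		if n >= d.Threshold {
			nodes = append(nodes, node)
		}
	}
	sort.Strings(nodes)
	return nodes
}

// SchedulableNodes filters candidates, dropping every node flagged as abnormal.
func (d *Detector) SchedulableNodes(candidates []string) []string {
	abnormal := make(map[string]bool)
	for _, n := range d.AbnormalNodes() {
		abnormal[n] = true
	}
	var out []string
	for _, c := range candidates {
		if !abnormal[c] {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	d := NewDetector(2)
	d.Observe(PodAnomaly{Node: "node-a", Reason: "FailedMount"})
	d.Observe(PodAnomaly{Node: "node-a", Reason: "ContainerCannotRun"})
	d.Observe(PodAnomaly{Node: "node-b", Reason: "FailedMount"})

	fmt.Println(d.AbnormalNodes())                                          // [node-a]
	fmt.Println(d.SchedulableNodes([]string{"node-a", "node-b", "node-c"})) // [node-b node-c]
}
```

In a real cluster the exclusion list would more likely be handed to the scheduler than filtered by hand, for example as a node affinity rule on `kubernetes.io/hostname` with the `NotIn` operator, or by tainting the flagged nodes. Which mechanism fits KubeDL best is left open here.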

Why is this needed:

  1. detect infrastructure problems proactively so that jobs run more robustly.