tinkerbell/cluster-api-provider-tinkerbell

Implement retries for BMC interactions


BMCs are known to fail or behave unpredictably. When a Hardware resource references BMC data, CAPT uses Rufio to power machines off and on and to configure netboot. The Rufio Tasks/Jobs report whether they failed or succeeded. For increased resiliency, we should consider implementing retries in CAPT for the Rufio interactions.