The final read phase failed after terminate-nemesis
Closed this issue · 0 comments
yito88 commented
What happened
Sometimes, after the nemesis requested the C* crash, the final read phase failed because some nodes were still down.
Cause
The cause is that terminate-nemesis is not enough to recover the cluster. It only requests an operation that stops the existing failure injection; it doesn't wait for recovery. That's why the final read phase can start before the cluster has recovered.
Solution
We need to wait for the cluster to recover. We can add a wait function just before the final read, because by that point terminate-nemesis has already started the recovery.
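
A minimal sketch of such a wait, written in Python only for illustration (the actual test suite may use a different language and its own helpers). The names wait_for_recovery and is_up, and the timeout and interval defaults, are assumptions and do not come from the test code. The idea is to poll every node until all of them report up, and to fail loudly if recovery takes too long:

```python
import time
from typing import Callable, Sequence


def wait_for_recovery(
    nodes: Sequence[str],
    is_up: Callable[[str], bool],
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Block until every node reports up, or raise TimeoutError.

    `is_up` is a hypothetical health check supplied by the caller,
    e.g. a function that tries to connect to the node.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        # Re-check all nodes on every pass so a node that flaps is caught.
        down = [node for node in nodes if not is_up(node)]
        if not down:
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"nodes still down: {down}")
        sleep(interval_s)
```

The final read would then run only after wait_for_recovery returns, so a timeout shows up as a distinct error instead of a failed read.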