hashicorp/raft

What happens when FSM.Apply fails?

Closed this issue · 9 comments

The function looks like Apply(*Log) interface{}. It does not support returning an error. What is the best practice for treating failure scenarios in this case? If Apply fails in your FSM implementation, what do you do? How do you notify Raft that this failed?

I think that in those cases you should panic or abort the node, since your FSM won't be able to apply further entries. Bubbling up the error wouldn't help, since the Raft algorithm has no mechanism to deal with these situations (that is, the mechanism is to let the node die).

Note that the only reasonable failure to apply an entry to the FSM should be basically an out-of-memory error (which in Go automatically leads to a panic), since there should generally be no network or disk I/O involved. All other errors should only be programming bugs (so panic'ing is a good choice), or "application" errors that belong to the domain of the FSM and that are deterministic (e.g. the input provided by the user is invalid), and those should be handled internally by the FSM (e.g. discard the command, return an object that encapsulate a description of the error, or whatever).

I see, ok. This makes sense. But what happens when a node restarts? Isn't it up to the FSM to load all the applied entries? Do I understand it correctly that the log stores shouldn't have anything to do with the FSM application logic?

Or does it re-apply when it restarts? And if that is the case, it is fine if FSM.Apply only does in-memory operations, right? I notice that FSM.Apply is called repeatedly on restart. Are snapshots ever used to reload if it's a lot of data? When there is a lot of data, rebuilding the state using Apply from the beginning of time would take a considerable amount of time.

When a node restarts you have to apply to your FSM the latest snapshot plus all committed entries that were submitted after the latest snapshot occurred. This will be done automatically by the hashicorp's implementation (which basically will call your FSM's Restore() and Apply() methods during the start up phase).

The log stores are indeed unrelated to the FSM. The FSM only applies committed entries, the store persists also entries that haven't been committed yet (because they are currently reaching a quorum).

The FSM should typically do only in-memory operations, yes. As said, at startup hashicorp's implementation will use the latest available snapshot, to reduce the overhead of rebuilding the FSM .

Hope that's clear.

If you haven't already, I think you should really read Ongaro's paper on raft.

Yes, that makes perfect sense. Thank you.

Hello! After discussion with the Vault and Nomad teams, we agreed that @freeekanayaka is correct: the node should panic because there's no failure mechanism in Raft for addressing errors. Additionally, the FSM should only ever yield an error/panic in very rare cases like running out of disk or running out of memory, in which case it makes the most sense to crash the node.

This is done in Nomad presently.

I'm going to close this issue because we don't plan to support returning an error, preferring the panic as statement above. :)

what about a timeout? a multi-threaded / loaded service implemented with multiple go routines could exhibit de facto non-determinism. if... given sufficient time... the operation would succeed on each node, but in one case, it does not, the operation would leave the node in an inconsistent state.

Hi @jkassismz , could you clarify a bit more for me?

It sounds to me like in the scenario you describe: if the operation fails, leaving the node in an inconsistent state, then it indeed should panic.

Hi @freeekanayaka @RobbieMcKinstry .
What happens in process of fsm.apply many logs panic in this time lastApplied index has been update to latest?
Will fsm lose not applied logs?

Hi @vision9527 ! I'm afraid I'm no longer on this project, and I'm not sure what the answer is. I'm sure someone else from the Consul team will be able to reply to you shortly. Best of luck!