hashicorp/raft

Allow network transfer of snapshot to be cancelled during shutdown

mauri870 opened this issue · 0 comments

We were discussing an issue in rqlite/rqlite#1695 where raft.Shutdown() prevents the program from exiting because raft is in the middle of a network transfer of a snapshot. If the snapshot is on the order of gigabytes, this transfer step can take a long time.

Looking at the source code, it seems that raft blocks during the network transfer and does not check the shutdownCh channel to quit if a shutdown was initiated. There is a check for the shutdown signal in the restore future, but that happens after the transfer operation takes place.

I have a proposed fix here. I would be happy to submit a PR if it makes sense.

Another option would be to convert the network transfer into a future, but that may not be necessary.