mtrudel/bandit

Phoenix LiveView doesn't handle server restart gracefully

derekkraan opened this issue · 3 comments

Hi @mtrudel,

First off, bandit is really looking great, excellent work on the whole thing!

I filed a bug report here with LiveView, but Jose asked me to file an issue with bandit instead.

The basic run-down of what is happening is:

When the server is restarting, bandit closes the websocket connection with status code 1001 (meaning: server going away). LiveView, however, is interpreting this as "browser navigated to another page" and subsequently does not attempt to reconnect, leaving the user with a spinning cursor until they reload the page manually.

Returning a status code 1000 instead would solve the issues with Phoenix. I'm not sure what your take is on that though from the perspective of standard compliance.

I agonized over this corner of behaviour for awhile when I was working up the first pass at WebSocket support - I've always understood RFC 6455§7.4.1 to prescribe 1001 as the appropriate close code for a server going down. The spec is a bit vague on specifics here though, so it's definitely open to interpretation. Let's look at some prior art, noting that 1000 is specified to 'indicate[] a normal closure, meaning that the purpose for which the connection was established has been fulfilled':

  • Cowboy uses 1000 for stop and timeout cases. Setting aside the semantics of the timeout case (which is pretty clearly incorrect), it's arguable that a 1000 is correct for the server shutting down.
  • The ws library also uses 1000 for the shutdown case
  • Autobahn isn't prescriptive one way or the other

The above combined with the inherent vagueness of the spec makes this seem like a bit of a silly hill to die on; I'll be updating Bandit's behaviour to return 1000 in this case shortly.

Fixed in 0.6.8!

Thanks for the quick turnaround on this, will push that to prod now. Shortest-lived fork I have probably ever had to run in prod.