gorilla/websocket

Graceful shutdown of the websocket connection.

markusthoemmes opened this issue Β· 10 comments

It seems like to achieve a "normal" shutdown of a websocket connection, one needs to do a dance like:

  1. Send Close message
  2. Wait for receiving a Close message
  3. Actually close the connection

The Close method of today closes the connection immediately, without warning or graceful shutdown. Would it be desirable to have a Shutdown method like golang's http servers, which does the dance explained above for the user?

I think this is desirable to have, as well as in-built support for checking liveness using ping/pong and deadlines. Most applications need these features.

The current API mostly handles the wire format. The only thing the package does above the wire format is respond to ping and respond to close. This low-level API provides flexibility for the applications that need it (like the one I work on), but most don't need this flexibility.

This package can benefit from a high-level API that wraps up all these things that applications need: shutdown, checking liveness, concurrent sends, ...

@garyburd Can you please assign this issue to me. I would like to work on it.

@ankur0493 Please post a proposed API and overview of the implementation for discussion.

I tried to assign you to this issue, but Github will not let me do it. Please let me know what I need to do.

Thanks @garyburd. Will soon share the API details.

This feature is a helper method. It's possible for applications to implement the RFC today. If an application does not initiate the closing handshake, then the application probably implements the RFC without doing anything special.

Based on discussion in #487, here's my summary of the design:

To avoid concurrent reads on the connection, the following two cases must be handled differently:

  • Shutdown is executed from the reading goroutine
  • Shutdown in executed from some other goroutine.

The common code for both cases starts with:

err := c.WriteControlMessage(CloseMessage,FormatCloseMessage(code, msg))
if err != nil && err != ErrCloseSent {
      // If close message could not be sent, then close without the handshake.
      return c.Close()
}

The remaining code for the reading goroutine case is:

 // To prevent ping, pong and close handlers from setting the read deadline, set these handlers to the default.
c.SetPingHandler(nil)
c.SetPongHandler(nil)
c.SetCloseHandler(nil)
c.SetReadDeadline(timeout)
for {
    if _, _, err := c.NextReader(); err != nil {
        break
    }
}
return c.Close()

The remaining code for shutdown from another goroutine is:

    select {
    case <- c.readDone:
    case <- time.After(timeout):
    }
    return c.Close()

where c.readDone is a channel that's closed when c.readErr is changed from nil to a non-nil value.

For someone who wants this now, I've implemented it in nhooyr.io/websocket@v1.7.0

See discussion in coder/websocket#160

@canelohill @tebuka would be good to get your thoughts on this PR.

The issue lists the steps to gracefully close a connection as:

  1. Send Close message
  2. Wait for receiving a Close message
  3. Actually close the connection

The API for (1) exists. Write a close message using WriteControlMessage, WriteMessage or NextWriter.

Correct applications read the connection in a loop until an error is returned. These applications close the connection when the read loop exits.

Because close messages are returned as errors, correct applications implement (2) and (3) without writing code specific to graceful shutdown.

No new APIs are required to gracefully close the connection using the steps listed above.

Step (2) is not correct. It should be:

  1. Wait for receiving a Close message or a timeout expires.

The timeout is the missing feature in this package. The package should provide a way for the application to specify the timeout. When a close message is sent and a timeout is specified, the connection should compute a maximum read deadline from the timeout and enforce that maximum.

I am uncertain about the best API for specifying the timeout. Some options are:

  • Add timeout field to Dialer and Upgrader. A con of this approach is that the timeout is fixed for the lifetime of the connection.
  • Add new connection method to set the timeout.
  • Add new method to send a close message w/ a timeout argument.

A proposed API:

// StartClosingHandshake initiates the closing handshake by sending a close message with
// the specified code and text to the peer. The application must continue reading the connection
// until an error is returned. Read on the connection returns a CloseError if the peer responded
// to handshake or a timeout error if a response from the peer is not received before the timeout.
// StartClosingHandshake can be called concurrently with all other connection methods.
func (c *Conn) StartClosingHandshake(closeCode int, text string, timeout time.Duration) error {}

A writing goroutine can call this method and return with the knowledge that the reading goroutine will encounter an error and close the connection.

A reading goroutine can call this method and continue processing messages until a close message is received from the peer or a timeout. A reading goroutine can ignore messages after starting the handshake by dropping into a loop that reads and discards messages.

An important aspect of this proposal is that the application can continue reading and processing messages until the peer says that the peer is done sending messages. The shutdown is not "graceful" if the peer's message stream is chopped off before the peer is done sending messages.