tower-rs/tower-http

`tower_http::trace::OnEarlyDrop`

scottlamb opened this issue · 5 comments

Feature Request

Motivation

My service has recently had problems in which (for still-undiagnosed reasons), it sometimes takes long enough to respond that the client has hit its HTTP timeout and closed the connection before response headers were sent. We want to monitor for this condition.

tower_http::trace is a great framework for observing related problems (HTTP server/client errors, excessive latency, etc.) but appears to have a complete blind spot here. The span will end without any further log entries (and in my service's case, we've chosen not to log on receiving the request, so these timed-out requests are invisible in logs).

Proposal

From experiment, I've found hyper drops the call future on client disconnect, so the obvious way to do this is to via a Drop impl on that future. If response headers haven't been sent at drop time, call a new hook similar to tower_http::trace::OnFailure.

The default impl could log this happened and the latency; maybe it needs a few parameters:

  1. a duration threshold,
  2. the log level under that threshold, and
  3. the log level at/above that threshold.

E.g., I don't consider it "my service's problem" if the client disconnects after waiting only 10ms but definitely do if it waited 10+s. The threshold is at least service-specific; in some cases decided based in a more complex fashion by a custom impl.

Name-wise, maybe OnEarlyDrop is more appropriate than say OnClientDisconnect, because if say you have a server-side Timeout layer in front of this one, the trace future's drop will be called on timeout. OnEarlyDrop is not as descriptive of the cause I'm interested in but also shouldn't ever be misleading.

A related idea: this isn't currently a focus of mine, but some folks may also want to know if the client closed the connection after a Response was returned but before the full body was generated. I think similarly OnEos just doesn't get called and there's no alternate plugin which does.

Alternatives

Feels to me like this is better handled at the connection level where you have more control over all this.

Tower-http doesn’t know anything about connections so relying on Drop feels brittle to me.

I don’t remember exactly how much is exposed by hypers low level server APIs but I’d recommend you look at that.

How would that work? An idle connection dropping isn't important. I care about the request (maybe requests in HTTP/2 or HTTP/3) that were started but not finished.

Feels like hyper should be the one to provide that information as well. That's where the future is being dropped anyway.

I think they provide that information by dropping the service associated with the connection and dropping the request futures. But if you're not interested in this feature, I'll keep using my own code.

While I agree the most robust solution would be one inside hyper, this does feel like it would be valuable. I'll reopen, though I don't have concrete plans for working on this.