whatwg/fetch

Request body streams that cannot be retried

annevk opened this issue · 63 comments

Currently our design is such that request body streams cannot be retried and that any redirects (other than 303) and HTTP authentication end up failing the request. This is rather nice for memory usage, but according to @ekr this plays badly with TLS 1.3 and QUIC, which both assume requests can be retried.

Sigh.

cc @jakearchibald @yutakahirano @tyoshino @domenic @mnot

Can't the network stack use Request.clone() or Request body stream tee() if they are in a replayable scenario?

Edit: From an implementation side of things we are already doing this in gecko. For example, hitting back and providing the option to re-post data requires this.
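A minimal sketch of the clone/tee approach mentioned above, using the web-facing Request API (the URL and body text are placeholders). clone() tees the underlying body, so either copy could in principle serve as the replay source:

```javascript
// Sketch: cloning a Request tees its body, so an implementation could keep
// one copy for the wire and one for a potential replay. Hypothetical usage;
// the open question is whether fetch() with stream bodies should have to
// pay this buffering cost at all.
async function demonstrateClone() {
  const original = new Request("https://example.test/upload", {
    method: "POST",
    body: "payload that may need to be replayed",
  });

  // clone() tees the underlying body stream; both requests can now
  // consume the body independently.
  const replayCopy = original.clone();

  const first = await original.text();
  const second = await replayCopy.text(); // still readable after the first
  return [first, second];
}
```

Both reads yield the full body, which is exactly the memory cost the streaming design was trying to avoid.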

@wanderview part of how we specified this feature is to allow the network stack to not do that in order to optimize resource allocation. I know it's generally done, but specifically for fetch() with streams we wanted it to not do that to preserve memory. It seems however we might need to do it anyway to play nice with TLS and QUIC at which point we might as well start following redirects and work with HTTP authentication too.

I'd like to hear more about this TLS and QUIC feature, as it seems strange to me that such protocols would require replaying the entire body.

ekr commented

Well, it's just a general property of networking protocols that you can have data in flight and then the other side sends you some message that causes the connection to fail and you have to retry. 0-RTT is just the most recent case where that crops up.

Is there any (fixed) upper bound on the amount of in-flight data?

Given https://groups.google.com/d/msg/mozilla.dev.tech.network/glqron0mRko/QMBjliPVAQAJ and its reply, it seems this can indeed be constrained by the implementation. So for a 10GiB request body, the implementation could just apply backpressure after 100KiB or so to make sure it succeeds, and otherwise retry with those 100KiB.

It might be worth adding a note about this. The re-post-on-back scenario from @wanderview can't be hit by this feature I think since that's just <form>.

ekr commented

Even ignoring 0-RTT, there's not really any limit on retry.

Consider the case where you are doing a huge POST. The amount of data you have in flight at any given time is dictated by the usual transport mechanisms, but the bottom line is that the server can send you a TCP RST half-way through, and at that point the HTTP stack can either retry or not, but you're basically in a posture where you have a network error and you need to do something.

Sure, but in that case not retrying seems like a reasonable default for the streams API. That is to say, to make it an application concern (application being the web site).
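What "making it an application concern" could look like for a site: the page constructs a fresh body stream for each attempt rather than the network stack replaying a buffered one. A hypothetical sketch — `makeBody` and `send` are stand-ins for the site's stream factory and its fetch() call:

```javascript
// Sketch of application-level retry: instead of the network stack replaying
// a stream it has buffered, the site constructs a fresh body per attempt.
// `makeBody` and `send` are hypothetical stand-ins for the site's stream
// factory and its fetch() call.
async function sendWithRetry(makeBody, send, maxAttempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // A fresh body each time; nothing needs to be replayed by the UA.
      return await send(makeBody());
    } catch (err) {
      lastError = err; // e.g. the TypeError fetch() rejects with
    }
  }
  throw lastError;
}
```

With real fetch() this might be used as `send = body => fetch(url, { method: "POST", body, duplex: "half" })`, with `makeBody` returning a new ReadableStream each call.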

ekr commented

I think you will find that that causes a surprising level of failures.

@mcmanus may have some thoughts here.

Does this mean that protocols layered over QUIC need to be prepared for retransmission just as those over TCP are, and that HTTP layered on top of QUIC must be prepared for such retries at the granularity of HTTP transactions (or slightly smaller units), rather than bytes/packets as with TCP?

Just making sure, is the current spec saying that if a service worker does e.respondWith(Response.redirect(...)), the body in the redirected request is null? It looks like that's Chrome's current behavior, and it's quite convenient as it means we don't need to store the request body somewhere before sending it to the service worker.

@wanderview not sure if Firefox has similar behavior, or if we have any WPT tests about this.

@mattto that's not intentional if it actually says that (and worthy of a separate issue). That would mean that using a service worker instead of a server changes the behavior. While we've accepted that for some cases, we have not for this one.

@annevk What if the service worker reads event.request.body and then does a redirect? Is the redirected request expected to have the same body?

@mattto I guess I'd be okay with that being different. We haven't defined this and thought through it in sufficient detail. See also #609 by the way about additional actions to be taken when you drop the body during a redirect, though it's unclear if that should be applicable here.

cc: @yoichio

Sorry for the long silence.

I think you will find that that causes a surprising level of failures.
@mcmanus may have some thoughts here.

Is this issue addressed by providing error details for web developers so that they can retry?

Per @ddragana letting applications handle the error should be okay. We still need some language for 0RTT though as per #538 (comment). It's a bit unfortunate there's no testing infrastructure for that as that's probably most important to get this right.

Is this issue addressed by providing error details for web developers so that they can retry?

If an error is returned, developers can decide whether or not to retry.
Do we want to automatically retry requests in the HTTP stack without surfacing an error? I would say no. Some requests, like POST, are not safe to retry by definition. (Firefox will not send POST requests during 0-RTT.) (Although HTTP stacks already do retry requests automatically in some cases due to network errors.)

We need to spec something like "For 0-RTT, we should hold the send buffer until the server confirms it arrived", right?

Yeah, and also apply backpressure after a given number of KiB has been buffered so the buffer doesn't have to be too big. Also, we probably need to dig up the formal terminology and reference the relevant RFCs.
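One way to express "apply backpressure after a given amount is buffered" with the Streams API is a byte-length queuing strategy: once the queue holds more than the high-water mark, `desiredSize` goes non-positive and a well-behaved producer stops enqueuing. A minimal sketch — the 100 KiB figure is illustrative, not a spec'd number:

```javascript
// Sketch: a ReadableStream whose queue signals backpressure once more than
// 100 KiB is buffered. A cooperating underlying source would check
// controller.desiredSize and pause when it drops to zero or below.
// The 100 KiB budget here is illustrative only.
const BUDGET = 100 * 1024;

let ctrl;
const stream = new ReadableStream(
  {
    start(controller) {
      ctrl = controller; // kept for the demonstration below
    },
  },
  new ByteLengthQueuingStrategy({ highWaterMark: BUDGET })
);

// Nobody is reading yet, so enqueued chunks accumulate in the queue.
const chunk = new Uint8Array(64 * 1024); // 64 KiB
ctrl.enqueue(chunk);
const afterOne = ctrl.desiredSize; // still positive: budget remains
ctrl.enqueue(chunk);
const afterTwo = ctrl.desiredSize; // negative: producer should pause
```

`desiredSize` is simply the high-water mark minus the bytes currently queued, which gives the producer an explicit signal for when to stop feeding the upload.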

cc @whatwg/http

Bodies already have to be replayed on certain redirects, as comment #1 notes. Chrome also replays bodies on certain errors when sending a POST on a reused socket (in the case of errors that could mean the server timed out the socket racily with our decision to reuse it). I had assumed other browsers do that as well, or only use fresh sockets with POSTs, which doesn't seem to work with H2.

Chrome's blob code allows upload bodies to be kept as on-disk objects and streamed as uploads from there. I'm not sure if this applies to all web-initiated uploads, just the ones that explicitly create and upload a blob, or any upload with a sufficiently large body.

@MattMenke2 to give some context. We can only upload "blobs" today and those can easily be replayed. For uploading streams we want a design that's minimal in terms of memory usage. So if you upload 1TiB, it shouldn't also try to buffer 1TiB. Therefore, as OP notes, the plan is to fail on redirects and various other known cases that require a replay. The question is if such replay can always be avoided, reduced to only a fixed 100KiB buffer or so, or result in error, or if uploading streams without also buffering everything is doomed to fail for one reason or another.

Thanks for the context! I didn't realize that this was for a streaming upload API, since the term "stream" is rather ambiguous - upload bodies are always streamed to the server over a socket.

The only chunked uploads Chrome supports are internal requests for its speech-to-text API, which does currently buffer the entire upload body for replay, but that certainly doesn't seem like a behavior we'd want exposed to the web.

Or, given that Firefox doesn't send POST requests during 0-RTT, if Chromium does the same (@MattMenke2?), how about specifying that so we can avoid retrying a ReadableStream for 0-RTT?

mnot commented

WRT not sending POST during 0RTT, see RFC8470 - Using Early Data in HTTP:

Absent other information, clients MAY send requests with safe HTTP methods ([RFC7231], Section 4.2.1) in early data when it is available and MUST NOT send unsafe methods (or methods whose safety is not known) in early data.

If there are safe methods that accept a request body (is SEARCH a thing yet?) I’m not sure that helps, unless we’d further constrain the RFC to only consider HEAD and GET as okay for 0RTT.

@yoichio: Indeed, Chrome won't send POSTs during 0-RTT.

mnot commented

SEARCH keeps on getting some amount of interest; there are a few people trying to bring it back up now.

Because the SEARCH method is used to send a query to the server, it is hard to imagine using a stream body to construct such a query; it would be a rare use case.

As 0-RTT is a technique for getting a response ASAP, sending a long request body during it doesn't make sense with any method.

We can state that user agents should not send the request body opportunistically when the body is made from a ReadableStream, in the fetch spec.

So 0RTT would depend on the request body type and request method? (I'd somewhat prefer we'd only do 0RTT for GET/HEAD, which are guaranteed not to have a request body.)

Actually that is similar to what Chromium is doing (@MattMenke2, please correct me if I'm wrong).

bool HttpUtil::IsMethodSafe(base::StringPiece method) {
  return method == "GET" || method == "HEAD" || method == "OPTIONS" ||
         method == "TRACE";
}
...

int HttpNetworkTransaction::Start(const HttpRequestInfo* request_info,
                                  CompletionOnceCallback callback,
                                  const NetLogWithSource& net_log) {
  ...
  if (HttpUtil::IsMethodSafe(request_info->method)) {
    can_send_early_data_ = true;
  }
  ...
}

Correct, we use a whitelist of safe methods (as defined by https://tools.ietf.org/html/rfc7231#section-4.2.1) for 0RTT, rather than a blacklist. That's used for both 0RTT over TLS and 0RTT over QUIC.

So, by removing OPTIONS and TRACE from IsMethodSafe, Chromium would do 0-RTT only for GET/HEAD:

bool HttpUtil::IsMethodSafe(base::StringPiece method) {
  return method == "GET" || method == "HEAD";
}

Should we spec this only for fetch(), or for HTTP more widely? I'd prefer only fetch(), if we can easily distinguish that in HttpNetworkTransaction, given the scope of this discussion/group (here!) versus the larger HTTP WG.

Chromium's network stack is currently unaware of how a request was initiated - it's mostly just a "basic" HTTP implementation, with fetch bits added only where necessary (Or where significantly more convenient than the alternative, at least).

mnot commented

For some clients / use cases, it's OK to send a body on a request that might be redirected / error'd / etc., because the client has the body available to resend. The requirement that's driving you to this design is specific to the use case for this API.

Worth noting that Chromium's network process has a streaming upload API where, on error, the network stack asks the upload to be replayed. If we don't want to replay the upload in this particular case, I assume the renderer could be hooked up to this API.

I also think we likely want to allow replaying the data if an error is received soon, since this is an inevitable thing that can happen if a browser reuses sockets, and a server times old sockets out.

I also think we likely want to allow replaying the data if an error is received soon, since this is an inevitable thing that can happen if a browser reuses sockets, and a server times old sockets out.

I think this is specific to the network stack: it decides to send data on a reused connection, and it may happen that the connection has already been closed, so the data needs to be replayed. (It is somewhat of a network optimization, with the risk that data needs to be replayed.) I think the API should not need to worry about this; maybe just a note in the spec that this may happen is enough. Likewise, 0-RTT replays are specifics of the networking protocols, and the API should not worry about them.

I agree that noting a UA might cache (partially) read data from the request body if needed is enough.

So far there are two opinions here, right?

  1. Cache (partially) read data from the request stream if needed, in any network stack.

  2. Retried stream requests must fail with an error, while showing a message to the web author.

I considered this again, and 2 is good for the Chromium implementation and for web authors because it is simple and memory-efficient.

Given @MattMenke2 and @ddragana's comments I suspect we need 1. It seems like we can avoid the 0RTT issue by only allowing that for GET/HEAD, but avoiding socket reuse and such seems harder?

Anyway, if there's a potential for some replay, I would very much like Fetch to include a strong recommendation to the upper limit for the network stack replay cache, e.g., at most X KiB. I want to avoid a race to the bottom where at some point the entire stream needs to be buffered because an implementation happened to do so and now servers break if your implementation does not. Part of the point of this feature is memory efficiency and the standard should ensure that is actually the case in practice.

you really ought to make OPTIONS 0-RTT safe: it's specifically called out by https://tools.ietf.org/html/rfc7231#section-4.2.1 as safe to replay, it's commonly the first exchange on a connection due to CORS preflights, and as a practical matter it doesn't have a body that is hard to buffer.

@mcmanus that's a good point, though the logic would end up a bit more complicated as OPTIONS with a request body shouldn't be 0RTT safe I think. So at that point 0RTT-safe would become a property of the request, not the request method.
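The distinction described here — 0-RTT safety as a property of the request rather than the method alone — could be sketched as a predicate over both. A hypothetical helper, not Chromium's actual logic:

```javascript
// Hypothetical predicate sketching the rule under discussion: a request is
// 0-RTT (early data) safe if its method is safe per RFC 7231 §4.2.1 AND,
// for the streaming-upload concern in this thread, it carries no body that
// might need unbounded replay. Not actual Chromium logic.
const SAFE_METHODS = new Set(["GET", "HEAD", "OPTIONS", "TRACE"]);

function isEarlyDataSafe(method, hasBody) {
  if (!SAFE_METHODS.has(method.toUpperCase())) return false;
  // OPTIONS/TRACE are safe per the RFC, but an OPTIONS request with a
  // stream body would reintroduce the replay problem, so require no body.
  return !hasBody;
}
```

Under this sketch a bodiless CORS preflight (`isEarlyDataSafe("OPTIONS", false)`) stays 0-RTT eligible, while `isEarlyDataSafe("OPTIONS", true)` and `isEarlyDataSafe("POST", false)` do not.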

it's safe with the body:

Of the request methods defined by this specification, the GET, HEAD,
OPTIONS, and TRACE methods are defined to be safe.

Sure, but for the purposes of upload streams we're then back to the same questions we were at with POST, in that we don't want to support replays of arbitrary lengths. And it doesn't seem great either to make distinctions based on the type of request body.

I get it - but its an important use case given how preflight works and the fact that OPTIONS almost never actually has a body.. it would seem like an own-goal to give up 0-RTT for concern over the extreme corner case that presented a problem.

it's been a while since I've read the streams spec, but surely there is some form of backpressure in it (blocking, flow-control credits, or ...)? And surely there is implicitly some state requirement for buffering already, given that you are willing to replay the headers. Compared to any OPTIONS body I've ever seen (which are rare to start with), the upper end of the header size is more than enough for the OPTIONS body.

can't you just buffer a K or 2 and then let backpressure kick in pending the handshake notification?

If this is getting long, should we split off the 0-RTT safety issue (which actually is not directly related to streams, so fork it to a new thread, I think) from limiting the retry buffer for streams (continue here)?

@mcmanus yeah, something like that is what I suggested in #538 (comment) (with perhaps too big a budget) and then this whole thread happened. It might still be reasonable though and I wouldn't mind it.

@yoichio isn't retrying for early connection drops and retrying for 0-RTT fundamentally the same issue?

@annevk If retrying here means retrying when the connection drops early during 0-RTT, and buffering the stream is only needed for that, then it is the same issue. And if the early-data request should not contain a stream body, the buffering is no longer needed. Right?

Since it's been a while, let me summarize the discussion so far:

Redirect

fetch() fails with promise-error like "Redirect was asked but Stream can be replayed"

Other cases that need a stream replayed

The UA might cache up to a few KiB of the stream, due to implementation restrictions and for 0-RTT resends.

Redirects and server-initiated authentication prompts. And the rejection is with the normal TypeError with perhaps a console-only message of the reason (which is "cannot be replayed" btw). We should not expose the failure reason to web content.

And with respect to the cache I think that's a must and ideally we specify an upper bound on the budget so it doesn't become a race to the bottom.

Redirects and server-initiated authentication prompts. And the rejection is with the normal TypeError with perhaps a console-only message of the reason (which is "cannot be replayed" btw). We should not expose the failure reason to web content.

I see. BTW, generally, what makes the difference between showing an "error" in the console and exposing it through an API?

And with respect to the cache I think that's a must and ideally we specify an upper bound on the budget so it doesn't become a race to the bottom.

"race to the bottom" means just caching a whole stream that makes Streams API memory-inefficient?
If so, I agree specifying an upper bound.
@MattMenke2 do you know how much KiBytes we need in Chrome to replay a stream on some inevitable situation like 0RTT, socket time out?

BTW, generally, what makes the difference between showing an "error" in the console and exposing it through an API?

Generally we don't let sites learn about network errors, so they do not get information about the user's environment or servers. Putting this in the console is harmless (although with Spectre...). Some of these security decisions have been upended by various Performance WG efforts and I haven't really tried to sort those out (and unfortunately neither have they), but until someone has done the work to better define the contract about what is okay to expose, I'd rather hold the line.

"race to the bottom" means just caching a whole stream that makes Streams API memory-inefficient?

Right, a big reason for offering this is allowing minimal memory usage.

@MattMenke2 do you know how many KiB we need in Chrome to replay a stream in some unavoidable situation like 0-RTT or a socket timeout?

I don't know that, unfortunately. We've always cached the entire thing in Chrome, which means we can handle redirects that require we re-send the body, though I have no idea if our single consumer even needs that.

It does need that (and that's specified in Fetch as well).

We've always cached the entire thing in Chrome, which means we can handle redirects that require we re-send the body,

Chrome needs to change that part at least, given the decision that the UA should fail to resend a ReadableStream on an HTTP 30X response.

Wait. Does 0-RTT also need the server's HTTP response in order to resend the request when the first attempt failed?
AFAIK, Chrome doesn't process the server response before sending the entire request body (right, @MattMenke2?).
Then resending a request after a 0-RTT failure needs an entire body copy in Chromium, just like a redirect.

Correct, Chromium only starts reading the response once the entire body has been sent.

HTTP streaming has become so useful that the word "streaming" has become a household word: transmitting audio or audio/video live, or radio shows, or streaming a TV tuner signal to a playback device.

The HTTP semantics of a single back/forward function call are very useful and clearly superior to other out-of-band protocols like FTP or SIP/RDP. FTP is dead mainly for that reason. And RDP is just as problematic: if you've ever tried to use a SIP softphone, you know what I'm talking about.

What we need is essentially the same what we call HTTP streaming, but as PUSH instead of PULL: The source of information initiates the connection, not the target. Still, it's streaming live.

Imagine a one-direction audio call, initiated by the sender, transmitting to a specific target. Let's say a baby phone, where the sender opens the connection as soon as it detects activity and sends it to a specific target URL. But the stream is live and ongoing.

So, a retry is inherently impossible, given that it's live data.

We (Chromium) are discussing a partial cache mechanism here.

If I'm reading the spec correctly, we return an error if the body is a stream (it has no source) and we get a redirect that isn't a 303: https://fetch.spec.whatwg.org/#concept-http-redirect-fetch (step 9).

Should this allow 301/302 responses if the method is POST?

That seems best tracked in a separate issue.

I wanted to leave a comment here as quite a bit of time has passed. There's a PR up now at #1204 which implements #538 (comment). In particular, for request streams (i.e., when the streams API is being used) we allow a 64 kibibyte buffer for replays in the networking stack. Once that buffer is exceeded, the only recourse when a resend is required is to terminate with a network error.

We're not saying anything specific about 0RTT and I don't think we need to as @ddragana said above.
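The replay budget described above can be sketched as a recorder that retains outgoing chunks only while they fit the budget; this is a hypothetical simplification of the spec'd algorithm, with invented helper names:

```javascript
// Sketch of the 64 KiB replay budget: record outgoing chunks up to the
// budget; a replay is possible only while the recorded total stays within
// it. Hypothetical helper, not the spec's exact algorithm.
const REPLAY_BUDGET = 64 * 1024;

class ReplayRecorder {
  constructor(budget = REPLAY_BUDGET) {
    this.budget = budget;
    this.chunks = [];
    this.recorded = 0;
    this.exceeded = false;
  }

  // Called for every chunk written to the network.
  record(chunk) {
    if (this.exceeded) return;
    if (this.recorded + chunk.length > this.budget) {
      this.exceeded = true; // stop buffering; replay is now impossible
      this.chunks = [];
      return;
    }
    this.chunks.push(chunk);
    this.recorded += chunk.length;
  }

  // Called when the transport needs to resend (e.g. a 0-RTT rejection).
  replay() {
    if (this.exceeded) {
      // Per the discussion above, the fetch terminates with a network error.
      throw new TypeError("network error: body cannot be replayed");
    }
    return this.chunks;
  }
}
```

The budget keeps memory bounded regardless of body size: a 1TiB upload never buffers more than 64 KiB, at the cost of failing outright when a resend is needed after that point.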