CompressionLayer breaks range requests, violating RFC 7233
and-reas-se opened this issue · 6 comments
Bug Report
Version
tower-http v0.4.3
Platform
Linux 6.5.2 x86_64
Description
Say you have a large file served by a ServeDir with a CompressionLayer on top. ServeDir supports range requests. If a client requests bytes x to y of a file and the response is compressed, the range should apply to the compressed representation, i.e. compressed bytes x to y. But the two in combination interpret it as uncompressed bytes x to y, so the wrong set of bytes is returned.
Furthermore, the Content-Range header in the response is set incorrectly: it is also based on uncompressed byte offsets rather than compressed ones.
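For context, a minimal setup along these lines reproduces it (sketch only; the route, directory, and port are placeholders):

use axum::Router;
use tower_http::{compression::CompressionLayer, services::ServeDir};

#[tokio::main]
async fn main() {
    // ServeDir slices the uncompressed file according to the Range header and
    // the outer CompressionLayer then compresses that slice, so the client gets
    // compressed(uncompressed[x..y]) instead of bytes x..y of the compressed
    // representation, and Content-Range still describes uncompressed offsets.
    let app = Router::new()
        .nest_service("/static", ServeDir::new("assets"))
        .layer(CompressionLayer::new());

    axum::Server::bind(&"0.0.0.0:3000".parse().unwrap())
        .serve(app.into_make_service())
        .await
        .unwrap();
}

Any client that sends both Accept-Encoding and a Range header on the same request can hit the mismatch.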
I ran into this problem in a real-life scenario: it intermittently breaks Microsoft's Azure CDN (Front Door), which sometimes uses range requests when fetching files from the origin.
The workaround that solved it for me was a small axum middleware that strips the Range header from requests and the Accept-Ranges header from responses (reproduced below).
use axum::{http::Request, middleware::Next, response::Response};

// Strip the Range header from the request so ServeDir never serves a partial
// response, and strip Accept-Ranges from the response so clients (and CDNs)
// stop attempting range requests.
pub async fn remove_range<B: std::fmt::Debug>(mut req: Request<B>, next: Next<B>) -> Response {
    req.headers_mut().remove("range");
    let mut response = next.run(req).await;
    response.headers_mut().remove("accept-ranges");
    response
}
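For completeness, this is roughly how it gets wired in (sketch; assumes axum 0.6's middleware::from_fn and a router like the one above):

use axum::{middleware, Router};
use tower_http::{compression::CompressionLayer, services::ServeDir};

fn app() -> Router {
    // Layers added later wrap the earlier ones, so remove_range runs first on
    // the request (stripping Range before ServeDir sees it) and last on the
    // response (stripping Accept-Ranges after CompressionLayer has run).
    Router::new()
        .nest_service("/static", ServeDir::new("assets"))
        .layer(CompressionLayer::new())
        .layer(middleware::from_fn(remove_range))
}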
Right, so the (de)compression middlewares need to either strip these headers or, if possible, adjust them. Makes sense.
Actually, maybe the better solution is for the middlewares to disable themselves on such requests. Otherwise, requesting the end of a huge file could lead to a full retransmission for no reason.
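Concretely, I'm imagining something like the untested sketch below: an outer middleware that drops Accept-Encoding whenever a Range header is present, so the compression layer never engages for those requests (the name is made up).

use axum::{http::Request, middleware::Next, response::Response};

// Untested sketch: if the client asked for a byte range, drop Accept-Encoding
// so the inner CompressionLayer leaves the response alone and the byte offsets
// coming out of ServeDir stay meaningful.
pub async fn skip_compression_for_ranges<B>(mut req: Request<B>, next: Next<B>) -> Response {
    if req.headers().contains_key("range") {
        req.headers_mut().remove("accept-encoding");
    }
    next.run(req).await
}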
Given David's approval of my previous comment, a PR implementing it would be welcome.
What if a client starts downloading a large file using a non-range request, the connection drops partway through, and then the client tries to resume the download using a range request? Wouldn't you get a corrupted file if the compression middlewares are enabled for the first request and disabled for the second?
Hm, I see what you mean. Really, the whole business of range requests over compressed data critically relies on the server having the full compressed content cached. That is annoying. Maybe the only solution really is what you wrote, then. But there's another problem: if ServeDir supported compression itself, wrapping it in a compression layer should be a no-op, but as written it would also filter out those headers 😕
I doubt this is the first time this has been figured out. Perhaps other server frameworks in other languages could show what to do.