cloudflare/pingora

Never send incorrect requests to upstream servers, ever.

Opened this issue · 5 comments

What is the problem your feature solves, or the need it fulfills?

HTTP/1.1 has an endless history of request smuggling attacks. See https://http1mustdie.com.

Describe the solution you'd like

Pingora should ensure that no matter what a client sends to it, requests sent to upstream servers always strictly conform to the HTTP RFCs. This means:

  • Only permitted characters in field names and values.
  • No leading or trailing whitespace in field values.
  • All other requirements from the standards are met, at least with regards to fields that can affect request framing.

Furthermore, there should be exactly one space between the fields of the request line and exactly one space after the colon in the fields. Trailiers should not be allowed to contain anything that might affect framing.

If Pingora is itself behind a reverse proxy, then Pingora needs to enforce these checks on incoming requests, too.

Describe alternatives you've considered

Potentially be vulnerable to request smuggling attacks in the future.

Additional context

This sounds like an awesome feature. For our use case, we have to support technically "incorrect" requests. We do enforce smuggling preventions (content length headers, etc). If the community is interested in "strict mode" or "rfc-enforced" validation, that would be incredible. It would just have to be something we can feature flag out of (or they can feawture flag into).

I think strict mode should be the default, as most people don't need to deal with the legacy clients and servers Cloudflare does.

You have a point there. 😂

For more context Pingora by design allows configuring itself for some deviation from the HTTP RFCs (really a lot of this was in order to be compatible with nginx behavior), and internally changing that approach is not a decision that we can make lightly as it will almost certainly result in breaking traffic we are obligated to support (and has before). Agreed that it probably makes sense to create a strict RFC-enforced mode however, and even for that to be default to encourage standard behavior.

There are some validations that need to be done in any mode I believe:

  • Content-Length (values must be valid and agree, etc)
  • Transfer-Encoding (must be chunked or missing, and if present Content-Length must be missing)
  • Connection (must not mention Transfer-Encoding, Content-Length, or Upgrade)
  • Header & trailer syntax (field names must not be empty, must not have forbidden characters, must not use line continuation as that can be abused for smuggling in the presence of bugs in other servers)
  • Trailers must not contain any of these headers.
  • If pipelining is not fully supported, the connection must be closed if there is data in the buffer right before the last request byte is sent.
  • Chunked encoding must use CRLF and must not have forbidden characters in the chunk extensions. Ideally the chunk extensions would be strictly validated.