seanmonstar/httparse

Allow spaces between header names and colons

nox opened this issue · 5 comments

nox commented

Sorry, I edited my issue, mixed up two problems at once.


curl -v "https://videoroll.net/vpaut_option_get.php?pl_id=6577"

Note the space after Access-Control-Allow-Credentials.

< HTTP/1.1 200 OK
< Server: nginx/1.16.0
< Date: Tue, 16 Mar 2021 09:03:39 GMT
< Content-Type: text/json;charset=utf-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Credentials : true
< Expires: Tue, 23 Mar 2021 09:03:39 GMT
< Cache-Control: max-age=604800

What does the spec says about this?

No whitespace is allowed between the header field-name and colon. In the past, differences in the handling of such whitespace have led to security vulnerabilities in request routing and response handling. A server MUST reject any received request message that contains whitespace between a header field-name and colon with a response code of 400 (Bad Request). A proxy MUST remove any such whitespace from a response message before forwarding the message downstream.

So first, I want to point out that the security vulnerabilities can't really happen through httparse (or at the very least, through Hyper), given Hyper does not keep around the literal text of the HTTP request, and thus those pesky spaces cannot be passed to anyone downstream AFAIK.

Second, and this is what matters to me (wearing my work hat), is that the spec itself implies that a proxy has to be able to parse those anyway to remove the headers, thus it needs to successfully be able to parse such a response. httparse (and thus Hyper) does not let any of that pass through, which is a problem for people implementing proxies.

Making a patch that makes those spaces not fail the entire parse is pretty trivial, but I wonder if we want a relaxed_response_parsing option of some sort. Such an option exists in Squid.


In the browser space, Firefox (and AFAIK Chrome too) happily ignore spaces between the header name and the colon, and will also ignore spaces in header names themselves (ignoring the whole "name: value" pair and just skipping to the next header in the response.

curl -v https://crlog-crcn.adobe.com/crcn/PvbPreference -X POST 

Note el famoso Updated Preferences: [] header in the response.

> POST /crcn/PvbPreference HTTP/1.1
> Host: crlog-crcn.adobe.com
> User-Agent: curl/7.64.1
> Accept: */*
>
< HTTP/1.1 200
< Updated Preferences: []
< Content-Length: 2
< Date: Tue, 16 Mar 2021 08:43:38 GMT
<

We can discuss whether or not to let that through at a later date.

I'm torn. The first sentence of that paragraph also clearly says that a space is no-no.

But, as long as we can keep the security and performance, I suppose it's fine to support when things are wrong.

relaxed_response_parsing option seems to be a good idea with stricter parsing as default

nox commented

I would rather not name any option relaxed_response_parsing as it doesn't tell us how it is relaxed.

@nox I agree with you! may be something which is obvious like allow_spaces_in_header_names and default being to false

nox commented

@avinassh #91 chose allow_spaces_after_header_name_in_responses.