dlundquist/sniproxy

http2 clients being proxied to wrong host.

cfarence opened this issue · 19 comments

I've been running sniproxy for about a year now and haven't had any issues until recently, when one of my hosts going through sniproxy started taking over. For some reason the gitlab server is being presented on hosts where it shouldn't be. For example, I have gitlab installed on git.example.com and a kanban on kanban.example.com. Using oauth, you are directed to gitlab for login; once it redirects you back to kanban.example.com, you are no longer presented the kanban server, you're presented the gitlab server. I have multiple services running under example.com, each on its own virtual machine. Logging in to gitlab is enough to set it off, causing any subdomain under example.com to lead to gitlab. I have other domains going through sniproxy and those are unaffected.

Everything uses SSL with the same wildcard cert. I've had the same setup for a while, and the problem only started when I recently upgraded to the latest gitlab. I'm using chrome and have looked at the http requests it's making, specifically at the headers and cookies, and I didn't see anything out of the ordinary. I was able to replicate this behavior on firefox but not internet explorer.

Clearing the browser cache and cookies makes the other hosts work fine again. Restarting sniproxy also makes the other subdomains work fine. I'm not sure what gitlab is doing to make sniproxy act this way. I confirmed that sniproxy is routing the request to the wrong backend server.

I'm hoping one of you guys knows more about the topic and can help me find out what is going on.

Have you correlated the access logs of sniproxy and the backend applications? It is possible that the client sees both sites are using the same key pair and pipelines requests for both sites in a single TLS session. I have not seen this behavior before, and explicitly tested for it when I first wrote sniproxy, but it may have changed over the last few years. Keep in mind sniproxy logs each TLS connection while webservers log each HTTP request, so with pipelining enabled this will be a one-to-many correspondence rather than one-to-one.

I would also check that you are using anchored regular expressions in your sniproxy configuration e.g. ^kanban\.example\.com$ rather than kanban.example.com so you don't have any unexpected matching behavior.
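To illustrate why anchoring matters, here is a minimal sketch using Python's `re` module. sniproxy itself matches with PCRE, but for patterns this simple the semantics are the same, assuming the pattern is searched for anywhere in the name. The extra hostnames are made up for the example.

```python
import re

# Hypothetical hostnames, used only for illustration.
unanchored = re.compile(r"kanban.example.com")
anchored = re.compile(r"^kanban\.example\.com$")

candidates = [
    "kanban.example.com",
    "kanban.example.com.attacker.test",  # extra suffix
    "notkanban.example.com",             # extra prefix
    "kanbanXexample.com",                # unescaped "." matches any character
]

for name in candidates:
    # Print the name, then whether each pattern matches it.
    print(name, bool(unanchored.search(name)), bool(anchored.search(name)))
```

The unanchored pattern matches all four names, while the anchored and escaped pattern matches only the first.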

I changed the expressions to use ^kanban.example.com$ structure with no success

I used the multicraft server's access log since it's apache and easy to find. I looked at sniproxy's access log and it says the request should have made it to the correct server. Looking at the access log for gitlab, I see the request being made for multicraft.example.com even though it shouldn't have made it to that server. Looking at the apache log for multicraft, the request never made it to that server.

Sniproxy says the request should be going to the right server, but the correct server says it never got it. Meanwhile the gitlab server says it got the request, and even shows the hostname in the access log.

It is possible that the client sees both sites are using the same key pair and pipelines requests for both sites in a single TLS session.

This is also occurring for me as a recent issue, and may be purely client side (Chrome 46)

I have two separate systems behind SNIproxy that share the same wildcard certificate and have similar security settings, but respond for separate domains.

@Stealthii
That may be it, though I'm using the same wildcard certificate across almost every host that SNIproxy proxies and didn't have any issue until I upgraded to a new version of Gitlab. Then all of a sudden other hosts were being proxied to the Gitlab server.

I haven't found a solution yet. As a stopgap I moved my Gitlab server to its own IP address so it wouldn't have to be behind SNIproxy. I still have all the other hosts behind SNIproxy using the same wildcard certificate and haven't run into the issue.

I'm assuming it's something the new version of Gitlab is adding to the request that causes it to happen. I haven't seen the issue with other services I host.

I'm looking into the configuration now to see what causes this. It's happening with Atlassian Jira for me.

As far as the client (Chrome/Firefox) is concerned, not only is it the same IP address, but it's the same server (SNIproxy), so this behaviour would make sense. I think the ball's in this side of the court (either something SNIproxy can/should handle, or a misconfiguration of a service behind SNIproxy)

This all runs on one dedicated server with proxmox installed and every service in its own container. It has a virtual network connecting the containers together. SNIproxy runs in one container and forwards requests to services over the virtual network connecting them.

It was running fine for months with Gitlab and everything else, then upgraded Gitlab and it all broke. I have several extra IPs from my provider so I just used that for Gitlab and all was normal again.

I did look at the http requests in chrome and on the servers logs and didn't see anything out of the ordinary. Gitlab's nginx logs still showed the hostname it was trying to reach. It just got proxied to the wrong server.

The HTTP/2.0 spec allows a client to re-use a TCP session for different hostnames when the IP is the same and the certificate is valid for both hostnames. So this is not a sniproxy issue but an HTTP/2.0 issue.

The solution is to disable HTTP/2.0 server-side (this should instruct the browser that it should not re-use the TCP session) or use different certificates for different backends (in such a way that the certificate for backend A is not valid for backend B and vice versa).

This should probably be documented. :-)

jfcoz commented

I've got a similar problem, also with a wildcard certificate and gitlab. It only happens with content on different subdomains of the same wildcard certificate.

From the gitlab documentation:

By default, when you specify that your Gitlab instance should be reachable through HTTPS by specifying external_url "https://gitlab.example.com", http2 protocol is also enabled.

Maybe this problem can be reduced with a smaller keep alive timeout on the gitlab instance.

I think I will keep sniproxy to redirect all my connections by certificate, and then for each wildcard certificate use an nginx or apache to decrypt traffic and send it to the correct backend based on the HTTP Host header.

SNIproxy makes a single forwarding decision for each incoming TCP session. This allows it to proxy TLS without access to TLS private keys. If you need to proxy to multiple different backends using the same certificate, you would probably be better served by a load balancer/proxy which performs TLS termination, such as nginx or haproxy.
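As a rough model of that behavior (not sniproxy's actual code), the sketch below picks a backend once per connection from the SNI name and then ignores whatever hostname later requests on that connection carry. The routing table, class, and function names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical routing table: SNI hostname -> backend name.
ROUTES = {
    "git.example.com": "gitlab-backend",
    "kanban.example.com": "kanban-backend",
}


@dataclass
class ProxiedConnection:
    """One TCP session through the proxy; the backend is fixed at open time."""

    backend: str

    def send_request(self, host_header: str) -> str:
        # The proxy cannot read the encrypted Host / :authority value,
        # so host_header plays no part in where the bytes go.
        return self.backend


def open_connection(sni: str) -> ProxiedConnection:
    """Pick the backend once, from the SNI name in the ClientHello."""
    return ProxiedConnection(backend=ROUTES[sni])


conn = open_connection("git.example.com")
print(conn.send_request("git.example.com"))     # gitlab-backend
print(conn.send_request("kanban.example.com"))  # gitlab-backend as well
```

A client that reuses the connection for a second hostname therefore reaches the first hostname's backend, which matches the access logs described earlier in the thread.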

The original use case was multiplexing a single IPv4 address for multiple IPv6-only servers under different administrative control. By not performing TLS termination, it allows each IPv6-only server administrator to maintain control of their TLS private key.

Since clients are now exhibiting this behavior (I tested for it in 2011), this issue probably warrants mention in documentation.

The HTTP/2 spec defines a new status code, 421, which the backend can use to indicate that the client should retry the request using a new connection. So if you are routing using SNI and your backend supports HTTP/2, then when it gets an incoming request it should not have received, it should respond with the 421 status code.

From the HTTP/2 spec:

In some deployments, reusing a connection for multiple origins can result in requests being directed to the wrong origin server. For example, TLS termination might be performed by a middlebox that uses the TLS Server Name Indication (SNI) [TLS-EXT] extension to select an origin server. This means that it is possible for clients to send confidential information to servers that might not be the intended target for the request, even though the server is otherwise authoritative.

A server that does not wish clients to reuse connections can indicate that it is not authoritative for a request by sending a 421 (Misdirected Request) status code in response to the request (see Section 9.1.2).

https://http2.github.io/http2-spec/#rfc.section.9.1.1
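A minimal sketch of the backend-side check, assuming a backend written with Python's standard `http.server`. That module only speaks HTTP/1.x, so this shows the status-code logic rather than a real HTTP/2 server, which would compare the `:authority` pseudo-header instead of `Host`. `SERVED_HOSTS`, the handler class, and `make_server` are hypothetical names.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical set of names this backend is really responsible for.
SERVED_HOSTS = {"git.example.com"}


class MisdirectedAwareHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Strip an optional ":port" suffix before comparing
        # (IPv6 literals are not handled in this sketch).
        host = (self.headers.get("Host") or "").split(":")[0].lower()
        if host not in SERVED_HOSTS:
            # 421 tells the client this server is not authoritative
            # for the requested name.
            self.send_error(HTTPStatus.MISDIRECTED_REQUEST)
            return
        body = b"ok\n"
        self.send_response(HTTPStatus.OK)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port: int = 0) -> HTTPServer:
    """Bind to localhost; port 0 lets the OS choose a free port."""
    return HTTPServer(("127.0.0.1", port), MisdirectedAwareHandler)
```

Calling `make_server(8080).serve_forever()` would start it. Whether a given browser then retries on a fresh connection depends on the client.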

@tellnes Thanks for the info. Unfortunately, since SNIproxy doesn't have the TLS keys it can not follow the protocol (http2 in this case) taking place within the TLS session.

@dlundquist Yes, it is the backend (which SNIproxy forwards to) that says it supports HTTP/2 (by adding the h2 ALPN extension), so it is the backend that should also support the 421 status code. But I think as HTTP/2 gets more popular, maybe that should be added to the Readme to help others.

The problem you ran into is caused by "Connection coalescing".
For example, if a browser (Chrome/Firefox) connects to a http2 server "a.example.org" which resolves to 192.168.2.1 and later connects to "b.example.org", it checks 2 things:

  • Do DNS entries match? This will always be the case since our proxy only has one IP.
  • Has a.example.org provided a certificate that also matches b.example.org (e.g. a wildcard or multidomain cert)?

If both are the case, it will not initiate a second connection but will instead reuse the existing connection to host a.example.org.
This is fine in theory, but because of sniproxy, a.example.org is unable to handle requests for b.example.org: they are in reality two different hosts.
One way to solve this should be to use different certs that each cover only a single domain (i.e. no wildcard).
I have not tested it but I think it might work.
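The two checks above can be sketched roughly as follows. This is a simplified illustration, not any browser's actual logic: real clients apply further conditions and differ from one another, and the name matching here handles only an exact name or a single left-most wildcard label. The function names are hypothetical.

```python
def cert_covers(san: str, host: str) -> bool:
    """Simplified name match: exact, or a single left-most wildcard label."""
    san, host = san.lower(), host.lower()
    if san.startswith("*."):
        # "*.example.org" covers "b.example.org" but not "x.b.example.org"
        # and not the bare "example.org".
        head, _, rest = host.partition(".")
        return bool(head) and rest == san[2:]
    return san == host


def may_coalesce(conn_ip, cert_sans, new_host, new_host_ips) -> bool:
    """True if an existing connection could be reused for new_host."""
    same_ip = conn_ip in new_host_ips
    cert_ok = any(cert_covers(s, new_host) for s in cert_sans)
    return same_ip and cert_ok


# Wildcard cert behind one proxy IP: reuse is allowed.
print(may_coalesce("192.168.2.1", ["*.example.org"],
                   "b.example.org", ["192.168.2.1"]))
# Single-name cert: the second host gets its own connection.
print(may_coalesce("192.168.2.1", ["a.example.org"],
                   "b.example.org", ["192.168.2.1"]))
```

Under these assumptions the first call returns `True` and the second `False`, which is why per-host certificates or separate IPs avoid the misrouting.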

#268 made me wonder, if we modify the buffer to drop the http2 flag, that would probably fix the problem of connection reuse, since connection reuse is only allowed in http2. This would of course be a configuration flag and not the default.

I just came across this issue thanks to the footnotes here. It explains weird behavior I've seen on proxied sites.

The linked footnote suggests using a separate IP for each domain which needs to be kept apart (which is actually quite feasible with IPv6). That seems to make firefox happy, but chromium still stumbles in places, and I'm not sure why. In the meantime this works quite well as a total hack: chromium-browser --disable-http2

fdxx commented

So, is there a solution to this problem?

I want to use HTTP2 with wildcard cert

So, is there a solution to this problem?

I want to use HTTP2 with wildcard cert

Well you got to choose two:

  • wildcard cert
  • Http2
  • single IP

The only other option is to use a full proxy instead of sniproxy which effectively terminates http2 and creates a new http2 connection to the target host, proxying frames correctly as it knows that they are in fact two different hosts.

dejl commented

Is there any way to disable http2 without intercepting the tls stream?
Is the only solution different IPs for each hostname?

Is there any way to disable http2 without intercepting the tls stream?

HTTP2 is announced via ALPN or NPN in TLS and can also be announced in HTTP headers.

If you control the upstream server, you should be able to have it not announce h2 support. The only other solution is not using a wildcard certificate. HTTP2 reuses a connection for another hostname only if the domain resolves to the same IP and the certificate presented on the original connection also verifies for the new hostname. So to avoid this you either need to present a cert that only verifies a single host, or you need multiple IPs.
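For an upstream written in Python, not announcing h2 could look like the sketch below, which builds a server-side TLS context that offers only `http/1.1` via ALPN. This is an assumption-laden illustration: other servers (nginx, apache, the one bundled with gitlab) have their own settings for this, and the function name and the optional certificate paths are hypothetical.

```python
import ssl


def make_http1_only_context(certfile=None, keyfile=None) -> ssl.SSLContext:
    """Server-side TLS context that does not advertise HTTP/2."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    if certfile:
        # Paths are supplied by the caller; none are assumed here.
        ctx.load_cert_chain(certfile, keyfile)
    # Offer only HTTP/1.1 via ALPN; "h2" is deliberately absent.
    ctx.set_alpn_protocols(["http/1.1"])
    return ctx
```

A server socket wrapped with this context will never select `h2` during the handshake, so clients fall back to HTTP/1.1 on that backend.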