jaegertracing/jaeger

[Bug]: query: TLS handshake error

Closed this issue · 5 comments

What happened?

Using v1.63.0 on Windows.
When I load https://localhost:16686/ in the browser,
I get this error on the console where I run jaeger-all-in-one.exe:

http: TLS handshake error from [::1]:55388: tls: first record does not look like a TLS handshake

I run with args:

--query.additional-headers="Access-Control-Allow-Origin: " --collector.otlp.enabled=true --collector.otlp.http.cors.allowed-origins="" --collector.otlp.http.cors.allowed-headers="*" --collector.otlp.http.host-port=4318 --collector.otlp.http.tls.key=%TLS%\key.pem --collector.otlp.http.tls.cert=%TLS%\cert.pem --collector.otlp.http.tls.enabled=true --query.http.tls.enabled=true --query.http.tls.key=%TLS%\key.pem --query.http.tls.cert=%TLS%\cert.pem

This works with v1.62.0.

Maybe this is related to PR #6023.
I don't see a corresponding update to the CLI doc.

Steps to reproduce

  1. Run jaeger-all-in-one.exe on Windows with args similar to above
  2. Open UI at default port in browser
  3. See described error in jaeger console

Expected behavior

It should work as before: the UI should load. Instead, it remains empty.

Relevant log output

http: TLS handshake error from [::1]:55388: tls: first record does not look like a TLS handshake

Screenshot

No response

Additional context

No response

Jaeger backend version

v1.63.0

SDK

No response

Pipeline

No response

Storage backend

none (all-in-one)

Operating system

Windows

Deployment model

No response

Deployment configs

@mahadzaryab1 can you take a look?

@mahadzaryab1 I am able to reproduce, so it looks like it might be related to introducing OTEL helpers:

go run ./cmd/all-in-one \
    --query.http.tls.enabled=true \
    --query.http.tls.cert=./pkg/config/tlscfg/testdata/example-server-cert.pem \
    --query.http.tls.key=./pkg/config/tlscfg/testdata/example-server-key.pem

After the request, the server logs a bunch of errors:

2024/11/21 23:27:00 http: TLS handshake error from [::1]:62373: client sent an HTTP request to an HTTPS server
2024/11/21 23:27:04 http: TLS handshake error from [::1]:62374: EOF
2024/11/21 23:27:05 http: TLS handshake error from [::1]:62375: remote error: tls: unknown certificate
2024/11/21 23:27:05 http: TLS handshake error from [::1]:62376: remote error: tls: unknown certificate
2024/11/21 23:27:13 http: TLS handshake error from [::1]:62377: remote error: tls: unknown certificate
2024/11/21 23:27:13 http: TLS handshake error from [::1]:62378: remote error: tls: unknown certificate
2024/11/21 23:27:13 http: TLS handshake error from [::1]:62379: tls: first record does not look like a TLS handshake

$ curl -v -k https://localhost:16686/
...
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://localhost:16686/
* [HTTP/2] [1] [:method: GET]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: localhost:16686]
* [HTTP/2] [1] [:path: /]
* [HTTP/2] [1] [user-agent: curl/8.7.1]
* [HTTP/2] [1] [accept: */*]
> GET / HTTP/2
> Host: localhost:16686
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
* Closing connection
* Recv failure: Connection reset by peer
* Send failure: Broken pipe
curl: (16) Recv failure: Connection reset by peer

@mahadzaryab1 I think this is the mistake:

if s.queryOptions.HTTP.TLSSetting != nil {
	err = s.httpServer.ServeTLS(s.httpConn, "", "")
} else {
	err = s.httpServer.Serve(s.httpConn)
}

The listener is already upgraded to TLS by the ToListener() function, so calling ServeTLS on it wraps the connection in TLS a second time. If I replace the whole branch with just s.httpServer.Serve(s.httpConn), the connection works, but ONLY with http/1.1:

$ curl -k --http1.1 https://localhost:16686/ > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  4454    0  4454    0     0   893k      0 --:--:-- --:--:-- --:--:-- 1087k

So there's still something funky going on, since curl tries http/2 by default and the server doesn't like it. Meanwhile, OTEL's tests do appear to cover both HTTP/2 and HTTP/1.1: https://github.com/open-telemetry/opentelemetry-collector/blob/00ad49af88794a8ad03d9994d290d1dea410ede2/config/confighttp/confighttp_test.go#L681

@yurishkuro Thanks for the reproduction steps. I've got a patch at #6239. I'm still unsure as to why the unit tests weren't able to catch this and will see if I can come up with a regression test for this.

Let's keep this open for a bit until we find out why the tests did not catch this.

One suspicious thing I see is here

if test.HTTPTLSEnabled && test.TLS.ClientCAFile != "" {

The HTTP/TLS tests only run when ClientCA is present. If I remove that second clause, the tests still pass, both before and after your fix, so they still do not catch the issue. But maybe we have other similar filters.

It's also not clear to me why we are doing manual dial outs in this block

if serverOptions.HTTP.TLSSetting != nil {