libressl/portable

Unexpected "unknown pkey type" errors during TLSv1.3 handshakes on server with multiple certificates

mdounin opened this issue · 1 comments

When testing freenginx with LibreSSL (last checked with LibreSSL 3.9.2), I observe the following errors in tests that use multiple certificates (ECDSA and RSA) and try to enforce the use of a particular certificate by offering a limited set of supported signature algorithms:

2024/05/12 02:20:57 [debug] 509#0: *2 SSL_do_handshake: -1
2024/05/12 02:20:57 [debug] 509#0: *2 SSL_get_error: 1
2024/05/12 02:20:57 [crit] 509#0: *2 SSL_do_handshake() failed (SSL: error:1402D0FB:SSL routines:ACCEPT_SW_CERT:unknown pkey type) while SSL handshaking, client: 127.0.0.1, server: 127.0.0.1:8443
2024/05/12 02:12:08 [debug] 338#0: *2 SSL_do_handshake: 1
2024/05/12 02:12:08 [debug] 338#0: *2 SSL: TLSv1.3, cipher: "TLS_AES_256_GCM_SHA384 TLSv1.3 Kx=TLSv1.3 Au=TLSv1.3 Enc=AESGCM(256) Mac=AEAD"
2024/05/12 02:12:08 [debug] 338#0: *2 reusable connection: 1
2024/05/12 02:12:08 [debug] 338#0: *2 http wait request handler
2024/05/12 02:12:08 [debug] 338#0: *2 posix_memalign: F6A9E5B0:256 @16
2024/05/12 02:12:08 [debug] 338#0: *2 malloc: F6A19930:1024
2024/05/12 02:12:08 [alert] 338#0: *2 ignoring stale global SSL error (SSL: error:1402D0FB:SSL routines:ACCEPT_SW_CERT:unknown pkey type) while waiting for request, client: 127.0.0.1, server: 127.0.0.1:8443
2024/05/12 02:12:08 [debug] 338#0: *2 SSL_read: 32

Note the crit and alert log messages: both mention "unknown pkey type", which should never happen. Further, the alert appears out of nowhere: SSL_do_handshake() has just returned success, but checking the error queue before the SSL_read() call reveals that an error is still in the error queue.

The root cause seems to be that ssl_sigalg_select() puts an error into the error queue if no matching signature algorithm is found:

	SSLerror(s, SSL_R_UNKNOWN_PKEY_TYPE);
	return NULL;

but the TLSv1.3 code also uses it in tls13_server_check_certificate() to check whether a particular certificate can be used, and that function is in turn called for the ECDSA and RSA certificates by tls13_server_select_certificate():

	cpk = &s->cert->pkeys[SSL_PKEY_ECC];
	if (!tls13_server_check_certificate(ctx, cpk, &cert_ok, &sigalg))
		return 0;
	if (cert_ok)
		goto done;

	cpk = &s->cert->pkeys[SSL_PKEY_RSA];
	if (!tls13_server_check_certificate(ctx, cpk, &cert_ok, &sigalg))
		return 0;
	if (cert_ok)
		goto done;

As a result, whenever the ECDSA certificate cannot be used because the client does not support the corresponding signature algorithms, the RSA certificate is used, but the "unknown pkey type" error is left in the error queue. This in turn causes the two errors shown above: if SSL_do_handshake() blocks, SSL_get_error() reports the queued error instead of SSL_ERROR_WANT_READ, which fails the handshake; if SSL_do_handshake() completes without blocking, the error is left hanging in the error queue and is reported as a stale error later.
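To make the mechanism concrete, here is a small stand-alone model of it. It does not use LibreSSL at all: the queue, `probe_cert()`, `select_cert()` and `classify()` are all hypothetical names made up for illustration, and `classify()` only approximates the behaviour of SSL_get_error() as described above (a queued error takes precedence over "want read").

```c
#include <stddef.h>

/* Toy model only: none of these names exist in LibreSSL. */

enum { ERR_NONE = 0, ERR_UNKNOWN_PKEY_TYPE = 251 };

/* Values chosen to mirror SSL_ERROR_NONE, SSL_ERROR_SSL, SSL_ERROR_WANT_READ
 * and SSL_ERROR_SYSCALL. */
enum {
	MODEL_ERROR_NONE = 0,
	MODEL_ERROR_SSL = 1,
	MODEL_ERROR_WANT_READ = 2,
	MODEL_ERROR_SYSCALL = 5
};

enum cert_type { CERT_ECC, CERT_RSA, CERT_NONE };

#define QUEUE_MAX 16

static int queue[QUEUE_MAX];
static size_t queue_len;

static void
err_push(int code)
{
	if (queue_len < QUEUE_MAX)
		queue[queue_len++] = code;
}

/* Return the earliest queued error without removing it. */
static int
err_peek(void)
{
	return queue_len > 0 ? queue[0] : ERR_NONE;
}

static void
err_clear(void)
{
	queue_len = 0;
}

/* A probe that queues an error on "no match", even though the caller
 * only wants a yes/no answer. */
static int
probe_cert(enum cert_type cert, int client_ecdsa, int client_rsa)
{
	int ok = (cert == CERT_ECC) ? client_ecdsa : client_rsa;

	if (!ok)
		err_push(ERR_UNKNOWN_PKEY_TYPE);
	return ok;
}

/* Try ECC first, then RSA; a failed ECC probe leaves its error behind. */
static enum cert_type
select_cert(int client_ecdsa, int client_rsa)
{
	if (probe_cert(CERT_ECC, client_ecdsa, client_rsa))
		return CERT_ECC;
	if (probe_cert(CERT_RSA, client_ecdsa, client_rsa))
		return CERT_RSA;
	return CERT_NONE;
}

/* For a non-positive return value, any queued error wins over would-block. */
static int
classify(int ret, int would_block)
{
	if (ret > 0)
		return MODEL_ERROR_NONE;
	if (err_peek() != ERR_NONE)
		return MODEL_ERROR_SSL;
	return would_block ? MODEL_ERROR_WANT_READ : MODEL_ERROR_SYSCALL;
}
```

In this model, `select_cert(0, 1)` picks the RSA certificate but leaves one entry in the queue. A subsequent `classify(-1, 1)` then yields `MODEL_ERROR_SSL` rather than `MODEL_ERROR_WANT_READ` (the crit case), while `classify(1, 0)` succeeds and leaves the entry queued for whoever checks next (the alert case).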

A band-aid which detects and drops SSL_R_UNKNOWN_PKEY_TYPE errors from the error queue after SSL_do_handshake() seems to resolve this, confirming the above. It would be great to fix this.

This is an ugly and rather annoying bug. Thanks for the detailed analysis and the clear writeup. We'll look into fixing this for the next major release.