camunda-community-hub/zeebe-client-csharp

Failed to connect to all addresses

ChrisKujawa opened this issue · 1 comments

Describe the bug

When you use TLS with the Zeebe Gateway, you may see errors like: failed to connect to all addresses.

This is caused by the expiry of a Let's Encrypt root certificate, see https://techcrunch.com/2021/09/21/lets-encrypt-root-expiry/?guccounter=1&guce_referrer=aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbS8&guce_referrer_sig=AQAAAFY4udX0bDWs8_-PdLKtK17SxP9RD51RacH6hg-udUA6s_mZNlxC7Kpq616I761qHZXvzEUJRftePdIrtrJJ-6Mm3PNf4QvcfG0-9RHnmfpqfBe8qIVbDGNmUsbb8WTqkK4aeSIzSxdkDyW1vy9-cKUa_rcIi4LybY1Ggly-FgXF

The Zeebe C# client uses the .NET gRPC library, which in turn uses the gRPC core library. The core library has a bug, grpc/grpc#27532, where it doesn't choose the right certificate to communicate with a secured endpoint.

To verify this, set the following environment variables:

export GRPC_VERBOSITY=debug
export GRPC_TRACE=tcp,http,api

This should show you something like:

Handshake failed with fatal error SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED.
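
The check above can be scripted. The following is only a sketch: `has_cert_failure` is a hypothetical helper name, and it assumes the gRPC debug output of your client was redirected into a log file (the `dotnet run` command and the `grpc-trace.log` file name in the comments are placeholders, not something from this repository).

```shell
# Hypothetical helper: succeeds if the given log file contains the
# TLS verification failure quoted above.
has_cert_failure() {
  grep -q 'CERTIFICATE_VERIFY_FAILED' "$1"
}

# Enable verbose gRPC logging, as described in the issue.
export GRPC_VERBOSITY=debug
export GRPC_TRACE=tcp,http,api

# Assumption: you start your client with some command and capture stderr, e.g.
#   dotnet run 2> grpc-trace.log
# and then check the captured log:
#   has_cert_failure grpc-trace.log && echo "certificate verification failed"
```

If the helper succeeds, the connection problem is most likely the certificate issue described here rather than a network or address problem.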

To work around this, you can set the gRPC root certificate via an environment variable on the client side:

GRPC_DEFAULT_SSL_ROOTS_FILE_PATH=/etc/ssl/certs/ISRG_Root_X1.pem
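
A small guard can avoid pointing gRPC at a file that does not exist on the machine. This is a sketch under assumptions: `use_grpc_roots` is a hypothetical helper, and the PEM path is the one from this issue, which may differ on your distribution.

```shell
# Hypothetical helper: export the gRPC root override only if the PEM file is readable.
use_grpc_roots() {
  roots_file="$1"
  if [ -r "$roots_file" ]; then
    export GRPC_DEFAULT_SSL_ROOTS_FILE_PATH="$roots_file"
  else
    echo "root certificate file not readable: $roots_file" >&2
    return 1
  fi
}

# Path taken from the issue text; adjust it for your system.
use_grpc_roots /etc/ssl/certs/ISRG_Root_X1.pem || echo "keeping gRPC default roots"
```

Run this in the same shell session (or container entrypoint) that starts the client, so the exported variable is visible to the client process.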

Alternatively, if you have many clients and use Kubernetes, you can set the preferred chain on the ClusterIssuer (see related comment):

apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: ...
spec:
  acme:
    privateKeySecretRef:
      name: ...
    server: ....
    email: ...
    preferredChain: ISRG Root X1  # added line

See also https://cert-manager.io/docs/configuration/acme/#use-an-alternative-certificate-chain

I will keep this open until there is a bug fix release for grpc-core.

Should be fixed with #346 🤞

See grpc/grpc#27532 (comment)