codedge-llc/kadabra

{:tls_alert, 'handshake failure'}

Opened this issue · 4 comments

Following on from codedge-llc/pigeon#72, I'm trying to get push notifications working in production. The simple command Kadabra.open('api.push.apple.com', :https, port: 443) or Kadabra.open('api.push.apple.com', :https, port: 2197) fails with this error:

[error] {:tls_alert, 'handshake failure'}
** (EXIT from #PID<0.664.0>) shell process exited with reason: shutdown: failed to start child: :connection
    ** (EXIT) bad return value: {:error, {:tls_alert, 'handshake failure'}}

Has this come up before? The Endpoint has working HTTPS on port 443 and force_ssl: [hsts: true], but I'm not sure whether Kadabra even goes through the endpoint. Of note, this error does not come up in the local development environment.

Thanks

hpopp commented

I suspect it's probably an issue with your production environment, but it'd be impossible for me to say without knowing more about your setup. Kadabra uses Erlang's :ssl module under the hood, which is what's throwing the error.

Out of curiosity, try connecting with the additional options that pigeon uses.

Kadabra.open('api.push.apple.com', :https, [
  {:cert, cert}, # or {:certfile, file_path}
  {:key, key}, # or {:keyfile, file_path}
  {:password, ''},
  {:packet, 0},
  {:reuseaddr, true},
  {:active, true},
  {:reconnect, true},
  :binary
])
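
Since the error is raised by :ssl rather than by Kadabra itself, it may also help to attempt the handshake with :ssl directly on the affected machine. The following is only a rough sketch: TlsProbe, probe and classify are hypothetical names, not part of Kadabra or pigeon, and the empty option list assumes an OTP release that does not require explicit verification options (newer releases may need them).

```elixir
defmodule TlsProbe do
  # Hypothetical helper for illustration; not part of Kadabra or pigeon.

  # Attempts a bare TLS handshake with Erlang's :ssl and classifies the result.
  # `opts` are plain :ssl options (for example certfile: and keyfile:).
  def probe(host, port, opts \\ [], timeout \\ 5_000) do
    {:ok, _apps} = Application.ensure_all_started(:ssl)
    result = :ssl.connect(to_charlist(host), port, opts, timeout)

    # Close the socket again if the handshake succeeded.
    with {:ok, socket} <- result, do: :ssl.close(socket)

    classify(result)
  end

  # Pure classification of the value returned by :ssl.connect/4.
  def classify({:ok, _socket}), do: :handshake_ok
  def classify({:error, {:tls_alert, alert}}), do: {:tls_failure, alert}
  def classify({:error, reason}), do: {:other_failure, reason}
end
```

Calling TlsProbe.probe("api.push.apple.com", 443) in an iex session on both the development machine and the production server should show whether the handshake failure reproduces without Kadabra in the picture.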

Running with the extra options produces the same error. I think you must be right about it being a production environment issue, as I'm able to run the production code locally and it works fine. So I think it's fine to close this issue, as I don't have any good reason to believe it's actually an issue with Kadabra. If you have any pointers on where I might debug the underlying error, or resources I could read up on, that would of course be much appreciated :)

The only other places where I could imagine it having something to do with pigeon/kadabra are the config being overridden in production, which seems unlikely, or possibly Kadabra using a different default port than the secure one. The pigeon config looks like:

config :pigeon, :apns,
  apns_default: %{
    cert: {:site, "keys/prod_cert.pem"},
    key: {:site, "keys/prod_key.pem"},
    mode: :prod
  }

This is on a simple DigitalOcean droplet running Ubuntu, with Let's Encrypt for TLS, and a production config like:

config :site, Site.Endpoint,
  server: true,
  secret_key_base: "${SECRET_KEY_BASE}",
  url: [host: "subdomain.site.com", port: 443],
  http: [port: 4000],
  force_ssl: [hsts: true],
  https: [
    port: 443,
    otp_app: :site,
    keyfile: "/etc/letsencrypt/live/subdomain.site.com/privkey.pem",
    cacertfile: "/etc/letsencrypt/live/subdomain.site.com/chain.pem",
    certfile: "/etc/letsencrypt/live/subdomain.site.com/cert.pem"
  ],
  cache_static_manifest: "priv/static/cache_manifest.json",
  code_reloader: false,
  root: ".",
  version: Application.spec(:site, :vsn)

Hi @lasernite, any chance you are using OTP 20.3 on the production server?

I think this may be the culprit:
#ERL-599: SSL handshake failure with servers that send certificate requests when no client cert is set

I just got bitten by this as well, and downgrading Erlang to 20.2 resolved it for me.
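
To check which OTP version a server is actually running, something like the sketch below might work from an iex session. OtpVersion, full and reported_culprit? are hypothetical names used only for illustration. The OTP_VERSION file is assumed to sit under the releases directory of a standard Erlang install, so the code falls back to the major release number when it is missing (as can happen inside a packaged release).

```elixir
defmodule OtpVersion do
  # Hypothetical helper for illustration only.

  # Returns the full OTP version string (e.g. "20.3") when the OTP_VERSION
  # file is readable, otherwise only the major release (e.g. "20").
  def full do
    release = List.to_string(:erlang.system_info(:otp_release))
    root = List.to_string(:code.root_dir())
    path = Path.join([root, "releases", release, "OTP_VERSION"])

    case File.read(path) do
      {:ok, contents} -> String.trim(contents)
      {:error, _reason} -> release
    end
  end

  # Assumption: only the exact version named in this thread is flagged.
  def reported_culprit?(version), do: version == "20.3"
end
```

Running OtpVersion.full() |> OtpVersion.reported_culprit?() on the production server gives a quick yes/no for the version discussed here. Application.spec(:ssl, :vsn) additionally reports the version of the ssl application itself.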

Hopefully a patch release of 20.3 will land soon to rectify this, on the off chance that you desperately need something that's in 20.3:
erlang/otp#1772

@sherbondy Thank you so much! That's such a gift. Had put this aside as I just wasn't making any progress, and that's exactly what the problem was. Working perfectly on downgrade. Thanks.