ManageIQ/kubeclient

updating from 4.9 to `master` breaks ssl

grosser opened this issue · 10 comments

tried to use

gem 'kubeclient', git: "https://github.com/abonas/kubeclient.git"

and got:

Kubeclient::HttpError: SSL_connect returned=1 errno=0 state=error: certificate verify failed (self signed certificate in certificate chain)
.gem/ruby/3.0.2/bundler/gems/kubeclient-967f4a179eaa/lib/kubeclient.rb:173:in `rescue in handle_exception'
.gem/ruby/3.0.2/bundler/gems/kubeclient-967f4a179eaa/lib/kubeclient.rb:159:in `handle_exception'
.gem/ruby/3.0.2/bundler/gems/kubeclient-967f4a179eaa/lib/kubeclient.rb:466:in `create_entity'

... even this did not fix it:

ssl_options[:verify_ssl] = OpenSSL::SSL::VERIFY_NONE

this blocks testing of #525

cben commented

@grosser I'm trying to reproduce now but I don't see this happening. Can you give more details?

  • what ruby version?
  • which auth_options & ssl_options are you using? (without sensitive values of course)
  • does your cluster use a self-signed cert? (though I agree VERIFY_NONE not suppressing that is a bug)
  • does it happen on .discover? on faraday actions (get/create/update/patch)? http.rb actions (watch)?
    (your traceback shows create_entity but I'm curious what else)

most usefully, any chance you can bisect when this started? (at a guess, it's likely we messed up some options during faraday switch)

cben commented

If I comment out certificate-authority-data in a kubeconfig file (with a local k0s cluster, self-signed), I do see an SSL error, but that's correct (kubectl agrees), and that error is slightly different: certificate verify failed (unable to get local issuer certificate).

However, what's weird in that situation is that context.ssl_options[:verify_ssl] was already 0 == VERIFY_NONE (yet I still got that error!), so forcing it to VERIFY_NONE is a no-op.

      if cluster_ca_data?(cluster)
        cert_store = OpenSSL::X509::Store.new
        populate_cert_store_from_cluster_ca_data(cluster, cert_store)
        ssl_options[:verify_ssl] = OpenSSL::SSL::VERIFY_PEER
        ssl_options[:cert_store] = cert_store
      else
        ssl_options[:verify_ssl] = OpenSSL::SSL::VERIFY_NONE
      end

Umm this looks bad 😨 Absent custom CA data in config, we should NOT be setting VERIFY_NONE! We should be checking clusters with publicly valid certs against system CA roots, shouldn't we? (working on fix + proper tests => #554, #555)
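To make the intended behavior concrete, here is a minimal sketch using only Ruby's standard OpenSSL library. It is not kubeclient's actual code and not necessarily what #554/#555 implement; `ssl_options_for` and its `ca_pem` parameter are hypothetical names chosen for illustration. The idea is that verification stays on in both branches, and only the source of trusted roots changes.

```ruby
require 'openssl'

# Hypothetical helper (an assumption, not kubeclient's API): build TLS options
# that always verify the server certificate.
def ssl_options_for(ca_pem = nil)
  store = OpenSSL::X509::Store.new
  if ca_pem
    # Custom CA data present: trust only the CA from the cluster config.
    store.add_cert(OpenSSL::X509::Certificate.new(ca_pem))
  else
    # No custom CA data: fall back to the system's default CA roots
    # instead of disabling verification.
    store.set_default_paths
  end
  { verify_ssl: OpenSSL::SSL::VERIFY_PEER, cert_store: store }
end

options = ssl_options_for
puts options[:verify_ssl] == OpenSSL::SSL::VERIFY_PEER
```

With this shape, a cluster with a publicly valid cert is checked against system roots, while a self-signed cluster without CA data fails verification, which matches what kubectl does.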

this was with in-cluster config ... not during discover (since we don't use that)
I think it's from faraday switch too ... I can debug more once I'm on a more stable internet connection.

@grosser hey, please do not release the fix with OpenSSL::SSL::VERIFY_NONE; it exposes our services that use kubeclient to a security risk. If that was already pushed, can you please remove this change and apply a more secure solution?

cben commented

The original report here is about TLS issues on the master branch only, not in released versions, and is not debugged yet. The "even this did not fix it" statement was just an observation for debugging, not intended as a fix.

What I mentioned is a separate bug in config parsing that was there as long as kubeclient existed 😭 I'll open a new issue, hoping to release a fix today(?) with full analysis but needs some more work on tests... => #554, #555

can close, something on master was fixed or our cluster is no longer funky :D

welp ... still happens on watches :(

Kubeclient::HttpError: Forbidden
File "/usr/bundle/ruby/3.1.0/bundler/gems/kubeclient-ae46d2d6161e/lib/kubeclient/watch_stream.rb" line 23 in each
File "/usr/bundle/ruby/3.1.0/bundler/gems/kubeclient-ae46d2d6161e/lib/kubeclient.rb" line 701 in return_or_yield_to_watcher
File "/usr/bundle/ruby/3.1.0/bundler/gems/kubeclient-ae46d2d6161e/lib/kubeclient.rb" line 401 in watch_entities

#537 fixes it :D

cben commented

can close, something on master was fixed or our cluster is no longer funky :D

Oh yes, I forgot to link that #540 fixed this.

welp ... still happens on watches :(

Kubeclient::HttpError: Forbidden
File "/usr/bundle/ruby/3.1.0/bundler/gems/kubeclient-ae46d2d6161e/lib/kubeclient/watch_stream.rb" line 23 in each

I was confused how #537 helped here — in my testing #540 was enough, with it TLS validation worked for me both in get_* and watch_* methods.

But both Forbidden and the line inside WatchStream#each don't look like a TLS error — sounds like the HTTPS connection was established and you got a 403 Forbidden response?
In that case it makes sense that #537 helped, if bearer token was not sent right on watches you'd get 403.
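For illustration, a rough sketch of what "sending the bearer token on a watch" means at the HTTP level, using only Ruby's standard library rather than kubeclient's internals. The helper name `build_watch_request`, the host, and the token are made-up placeholders; the request is only constructed, never sent. If the Authorization header were missing from a request like this, the TLS handshake would still succeed and the API server would reject the request at the HTTP layer.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (an assumption, not kubeclient code): build a watch
# request that carries the bearer token in the Authorization header.
def build_watch_request(base_url, token)
  uri = URI.join(base_url, '/api/v1/pods?watch=true')
  request = Net::HTTP::Get.new(uri)
  request['Authorization'] = "Bearer #{token}"
  request
end

# Placeholder host and token, for illustration only.
request = build_watch_request('https://kubernetes.example.invalid:6443', 'example-token')
puts request.path
puts request['Authorization']
```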

I'll reopen if I see it again 🤷