updating from 4.9 to `master` breaks ssl
grosser opened this issue · 10 comments
tried to use
gem 'kubeclient', git: "https://github.com/abonas/kubeclient.git"
and got:
Kubeclient::HttpError: SSL_connect returned=1 errno=0 state=error: certificate verify failed (self signed certificate in certificate chain)
.gem/ruby/3.0.2/bundler/gems/kubeclient-967f4a179eaa/lib/kubeclient.rb:173:in `rescue in handle_exception'
.gem/ruby/3.0.2/bundler/gems/kubeclient-967f4a179eaa/lib/kubeclient.rb:159:in `handle_exception'
.gem/ruby/3.0.2/bundler/gems/kubeclient-967f4a179eaa/lib/kubeclient.rb:466:in `create_entity'
... even this did not fix it:
ssl_options[:verify_ssl] = OpenSSL::SSL::VERIFY_NONE
this blocks testing of #525
@grosser I'm trying to reproduce now but I don't see this happening. Can you give more details?
- what ruby version?
- which auth_options & ssl_options are you using? (without sensitive values of course)
- does your cluster use a self-signed cert? (though I agree VERIFY_NONE not suppressing that is a bug)
- does it happen on .discover? on faraday actions (get/create/update/patch)? http.rb actions (watch)? (your traceback shows create_entity but I'm curious what else)
most usefully, any chance you can bisect when this started? (at a guess, it's likely we messed up some options during faraday switch)
If I comment out certificate-authority-data in a kubeconfig file (with a local k0s cluster, self-signed), I do see an SSL error, but that's correct (kubectl agrees), and that error is slightly different: certificate verify failed (unable to get local issuer certificate).
However, what's weird is that in that situation context.ssl_options[:verify_ssl] was already 0 == VERIFY_NONE (yet I was still getting that error!), so forcing it to VERIFY_NONE is a no-op.
if cluster_ca_data?(cluster)
  cert_store = OpenSSL::X509::Store.new
  populate_cert_store_from_cluster_ca_data(cluster, cert_store)
  ssl_options[:verify_ssl] = OpenSSL::SSL::VERIFY_PEER
  ssl_options[:cert_store] = cert_store
else
  ssl_options[:verify_ssl] = OpenSSL::SSL::VERIFY_NONE
end
Umm this looks bad 😨 Absent custom CA data in config, we should NOT be setting VERIFY_NONE! We should be checking clusters with publicly valid certs against system CA roots, shouldn't we? (working on fix + proper tests => #554, #555)
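To illustrate the behaviour being argued for above, here is a minimal sketch using only Ruby's OpenSSL stdlib. The helper name ssl_options_for and its ca_pem parameter are hypothetical and are not kubeclient's API; it is not necessarily how #554/#555 implement the fix. It assumes the CA arrives as a PEM string (kubeconfig's certificate-authority-data would need Base64 decoding first).

```ruby
require 'openssl'

# Hypothetical helper (not part of kubeclient): always verify the peer.
# With custom CA data, trust only that CA; without it, fall back to the
# system's default CA roots instead of disabling verification.
def ssl_options_for(ca_pem = nil)
  cert_store = OpenSSL::X509::Store.new
  if ca_pem
    # Assumption: ca_pem is a single PEM-encoded certificate.
    cert_store.add_cert(OpenSSL::X509::Certificate.new(ca_pem))
  else
    # Load the system-wide trusted CA certificates.
    cert_store.set_default_paths
  end
  { verify_ssl: OpenSSL::SSL::VERIFY_PEER, cert_store: cert_store }
end
```

The point of the design is that neither branch ever produces VERIFY_NONE; turning verification off would have to be an explicit choice by the caller.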
this was with in-cluster config ... not during discover
(since we don't use that)
I think it's from faraday switch too ... I can debug more once I'm on a more stable internet connection.
@grosser hey, please do not release the fix with OpenSSL::SSL::VERIFY_NONE; this exposes our services using kubeclient to a security risk. If that was already pushed, can you please remove this change and apply a more secure solution?
The original report here is about TLS issues on master branch only (not in released versions) and is not debugged yet. The "even this did not fix it" statement was just an observation for debugging, not intended as a fix.
What I mentioned is a separate bug in config parsing that was there as long as kubeclient existed 😭 I'll open a new issue, hoping to release a fix today(?) with full analysis but needs some more work on tests... => #554, #555
can close, something on master was fixed or our cluster is no longer funky :D
welp ... still happens on watches :(
Kubeclient::HttpError: Forbidden
File "/usr/bundle/ruby/3.1.0/bundler/gems/kubeclient-ae46d2d6161e/lib/kubeclient/watch_stream.rb" line 23 in each
File "/usr/bundle/ruby/3.1.0/bundler/gems/kubeclient-ae46d2d6161e/lib/kubeclient.rb" line 701 in return_or_yield_to_watcher
File "/usr/bundle/ruby/3.1.0/bundler/gems/kubeclient-ae46d2d6161e/lib/kubeclient.rb" line 401 in watch_entities
can close, something on master was fixed or our cluster is no longer funky :D
Oh yes, I forgot to link that #540 fixed this.
welp ... still happens on watches :(
Kubeclient::HttpError: Forbidden
File "/usr/bundle/ruby/3.1.0/bundler/gems/kubeclient-ae46d2d6161e/lib/kubeclient/watch_stream.rb" line 23 in each
I was confused how #537 helped here — in my testing #540 was enough, with it TLS validation worked for me both in get_* and watch_* methods.
But both Forbidden and the line inside WatchStream#each don't look like a TLS error; sounds like the HTTPS connection was established and you got a 403 Forbidden response?
In that case it makes sense that #537 helped, if bearer token was not sent right on watches you'd get 403.
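The distinction above (TLS failure vs. a 403 after a successful handshake) can be sketched with plain Ruby stdlib classes. This is only an illustration: the diagnose helper is hypothetical, and it uses Net::HTTP response classes rather than kubeclient's own error types.

```ruby
require 'net/http'
require 'openssl'

# Hypothetical helper: classify what came back from an HTTPS request.
# An SSLError means the handshake never completed, so no HTTP response
# exists. A 403 response means TLS succeeded and the server rejected
# the request itself, e.g. because of a missing or wrong bearer token.
def diagnose(result)
  case result
  when OpenSSL::SSL::SSLError then :tls_handshake_failed
  when Net::HTTPForbidden     then :authorization_rejected
  when Net::HTTPSuccess       then :ok
  else :other
  end
end
```

A caller would pass either the rescued exception or the response object, for example diagnose(e) inside a rescue OpenSSL::SSL::SSLError => e branch.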
I'll reopen if I see it again 🤷