banzaicloud/koperator

TLS termination at Ingress (Envoy) level

dobrerazvan opened this issue · 9 comments

Is your feature request related to a problem? Please describe.
Yes: currently there is no TLS termination at the Ingress (Envoy) level.

Describe the solution you'd like to see
We need a way to terminate TLS at the Ingress (Envoy) level. This includes:

  • switching Envoy from port-based multiplexing to name-based (SNI) multiplexing
  • having Envoy terminate TLS connections by adding a TLSContext to the Envoy config
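
To make the two bullets above concrete, a name-based (SNI) filter chain with TLS termination in Envoy's v3 config could look roughly like the sketch below. This is illustrative only: the hostnames, ports, file paths, and cluster name are placeholders, not taken from the linked gists.

```yaml
# Illustrative sketch: one SNI-matched filter chain per broker, with TLS
# terminated at Envoy. All names, ports, and paths are placeholders.
static_resources:
  listeners:
  - name: kafka_listener
    address:
      socket_address: { address: 0.0.0.0, port_value: 9094 }
    listener_filters:
    # tls_inspector extracts the SNI so filter_chain_match can route by name
    - name: envoy.filters.listener.tls_inspector
    filter_chains:
    - filter_chain_match:
        server_names: ["broker-0.kafka.example.com"]   # placeholder hostname
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
          common_tls_context:
            tls_certificates:
            - certificate_chain: { filename: /certs/tls.crt }  # placeholder path
              private_key: { filename: /certs/tls.key }        # placeholder path
      filters:
      - name: envoy.filters.network.tcp_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
          stat_prefix: broker_0
          cluster: broker-0   # placeholder upstream cluster per broker
```

With one such filter chain per broker, a single listener port can serve the whole cluster, instead of one port per broker as with port-based multiplexing.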

Describe alternatives you've considered
We considered using koperator's existing TLS support, but it surfaced some issues in our deployments:

  • certificate rotation requires a rolling restart of the Kafka cluster (more expensive than restarting the Envoy deployment)
  • during the transition to SSL, the current setup requires two ports per broker (one plaintext and one SSL), and with large deployments cloud-provider limits are reached

Additional context
Envoy config map sample -> https://gist.github.com/dobrerazvan/aa9ec700b165817319a464368abf2cf3
Kafka config map sample -> https://gist.github.com/dobrerazvan/6c3553425bac4b71149b9ddbb8142764

Adobe will provide this feature.

@dobrerazvan if you terminate TLS at the Ingress level, then client certificates (presented by the client app) won't be passed to the brokers, which breaks client-certificate-based authentication. How can you ensure that the client certificates reach the brokers?

@stoader SSL client authentication can be done with Envoy, if needed. We are interested in a security solution for Kafka where the local environment of the Kafka deployment is already secured and only outside data comes in via unsecured channels.
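
For reference, if client certificate verification were wanted at the edge, Envoy's DownstreamTlsContext can require and validate client certificates. This fragment is illustrative only; the file paths are placeholders:

```yaml
# Illustrative DownstreamTlsContext that also verifies client certificates
# at Envoy (mTLS at the edge). Paths are placeholders.
"@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
require_client_certificate: true
common_tls_context:
  tls_certificates:
  - certificate_chain: { filename: /certs/tls.crt }
    private_key: { filename: /certs/tls.key }
  validation_context:
    trusted_ca: { filename: /certs/ca.crt }   # CA used to verify client certs
```

Note that this only authenticates clients at Envoy; as discussed below, the certificate subject still would not reach the brokers for ACL evaluation.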

@dobrerazvan Kafka's ACL mechanism reads the subject field of the client certificate to identify the client application and determine which topics it has access to. This is why the client certificate presented by the client application has to reach the broker: the broker reads the content of the certificate.

Currently we do not need the built-in SSL-based authentication. We will implement a separate OAuth-based plugin in Kafka that will take care of authentication. We are only interested in a way to bring traffic into the cluster securely.

Will you make the TLS termination at the Ingress level configurable (e.g. with a passthrough mode) so that users who rely on client-certificate-based authentication and authorization can still leverage this Kafka feature?

Yes, what we want to develop will be fully backward compatible with all features currently available in the operator. We will add some config bits that, when enabled, turn on TLS termination in Envoy; otherwise Envoy keeps doing its TCP-proxy work as before and passes all certificate info through to the brokers.
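
To make the opt-in intent concrete, such a config bit might sit on the external listener definition in the KafkaCluster custom resource. The field name below is purely hypothetical, invented for illustration, and not part of koperator's actual API or any agreed design:

```yaml
# Hypothetical sketch only: the sslTerminationAtIngress field is invented
# for illustration and does not exist in koperator's API.
listenersConfig:
  externalListeners:
  - name: external
    type: ssl
    externalStartingPort: 19090
    containerPort: 9094
    sslTerminationAtIngress: true   # hypothetical flag; false would keep TCP passthrough
```

Defaulting such a flag to false would preserve today's passthrough behavior for users who rely on client-certificate-based authentication.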

@dobrerazvan You can rotate certificates without rolling updates with kafka dynamic configs.
https://docs.confluent.io/platform/current/kafka/dynamic-config.html

This is interesting; we didn't consider looking into this functionality, and it might have solved the problem with what already exists. One downside I can see is the need to double the number of ports (NodePorts) required to run two listeners in parallel, one plaintext and one SASL_SSL: on a 30-broker cluster one would need 60 NodePorts, which is troublesome on shared infrastructure due to documented AWS/Azure limitations.

@dobrerazvan I found that koperator doesn't set password.encoder.secret in the broker config, and it cannot be added dynamically. It is necessary for rotating a listener's certificate. This problem can be solved in a small PR.
Once that is solved, you can exec bash in a broker pod, unset the KAFKA_OPTS env variable, and then experiment with the dynamic settings, e.g.:

    /opt/kafka/bin/kafka-configs.sh --bootstrap-server kafka-0.kafka.svc.cluster.local:29095 --entity-type brokers --entity-name 0 --alter --add-config listener.name.controller.ssl.keystore.location=/var/run/secrets/java.io/keystores/server/controller/keystore.jks,listener.name.controller.ssl.keystore.password=12345

This dynamic setting is "per-broker" scoped, so it needs to be set for every broker (use --entity-name $brokerID).
Of course, this wouldn't be best practice. I would go with a small application that replaces the content of the listener's secret with the new certificate, so the new cert is loaded into the brokers automatically, and that then sets listener.name.controller.ssl.keystore.location and the password dynamically through a Kafka client library.