banzaicloud/koperator

Support Koperator in OpenShift

cniackz opened this issue · 3 comments

Is your feature request related to a problem? Please describe.
Yes, I am trying to install Koperator in OpenShift, but I have run into several security-related issues. The latest one is:

$ oc logs kafka-operator-operator-8446494586-b8frg -n kafka -c manager
{
  "level": "info",
  "ts": "2022-04-14T01:57:10.700Z",
  "logger": "controller.KafkaCluster",
  "msg": "creating resource failed: configmaps \"kafka-kafka-jmx-exporter\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>",
  "reconciler group": "kafka.banzaicloud.io",
  "reconciler kind": "KafkaCluster",
  "name": "kafka",
  "namespace": "kafka"
}
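
For what it's worth, this particular error usually comes from the OwnerReferencesPermissionEnforcement admission plugin, which OpenShift enables by default: setting blockOwnerDeletion on an owner reference requires update permission on the finalizers subresource of the owner (here the KafkaCluster). A minimal sketch of the extra RBAC rule this would imply is below; the ClusterRole name is illustrative, and the real fix would likely extend the operator's existing role rather than add a standalone one:

# Illustrative only: grant the operator's service account update on the
# owner resource's finalizers subresource.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kafka-operator-owner-finalizers   # hypothetical name
rules:
  - apiGroups: ["kafka.banzaicloud.io"]
    resources: ["kafkaclusters/finalizers"]
    verbs: ["update"]

Bound to the operator's service account, a rule like this should let the garbage-collection admission check pass when the operator creates owned ConfigMaps such as kafka-kafka-jmx-exporter.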

Describe the solution you'd like to see
In short: add documentation (installation steps) on how to get Koperator running on OpenShift, as other projects such as Strimzi do.
A bit more of the story:
I am following the instructions provided at: https://banzaicloud.com/docs/supertubes/kafka-operator/install-kafka-operator/
Specifically, I am not using Supertubes; I am installing Koperator and its requirements independently. In minikube this works with no issue whatsoever, but I can't find instructions for OpenShift, so I feel I am on my own solving all sorts of problems. Comparing your documentation with Strimzi's, I find Strimzi much easier to follow.

Describe alternatives you've considered
I was able to use Strimzi and it is working fine in my OpenShift environment, with fewer steps and better online support in general. I will keep using Strimzi for now, but it would be nice to have the Banzai Cloud solution working as well.

Additional context
To give a bit more context, the goal behind this is to have MinIO create notifications in Kafka, or Kafka save files/logs in MinIO as the S3 solution. The Kafka operator is more of a secondary step for me, but it is still required so that I can help customers set up different environments. Also note that I did finally get the operator running after many changes to the SecurityContext (a sketch of the kind of change follows the pod listing below):

oc get pods -n kafka
NAME                                       READY   STATUS    RESTARTS   AGE
kafka-operator-operator-8446494586-b8frg   2/2     Running   10         171m
prometheus-kafka-prometheus-0              2/2     Running   1          8h
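
For reference, the kind of SecurityContext change that is typically needed under OpenShift's restricted SCC looks roughly like the sketch below; these are not my exact values, just the general shape:

# Container-level securityContext compatible with the restricted SCC:
# no fixed UID (OpenShift assigns one from the namespace's range),
# no privilege escalation, all capabilities dropped.
securityContext:
  runAsNonRoot: true
  allowPrivilegeEscalation: false
  capabilities:
    drop: ["ALL"]

Alternatively, granting the operator's service account a less strict SCC (for example oc adm policy add-scc-to-user nonroot -z <serviceaccount> -n kafka) avoids editing every securityContext, at the cost of a broader exception.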

But the problem is that I still don't get the brokers, which may be related to the error posted at the start of this issue.

Unfortunately, we don't have OpenShift clusters at hand to test this, and we can't say when we will. Would you take a look at it and submit a PR?

@stoader @panyuenlau

I am seeing a similar issue when deploying Koperator on OpenShift, though the error is different. It looks like all the errors are something like the ones below:

  • failed to wait for KafkaTopic caches to sync
  • failed to wait for CruiseControl caches to sync
  • failed to wait for KafkaUser caches to sync
  • failed to wait for KafkaCluster caches to sync

But I have no idea why this happens. Is this also related to RBAC?

I1010 12:15:52.366795       1 request.go:665] Waited for 1.027003792s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/events.k8s.io/v1beta1?timeout=32s
{"level":"info","ts":"2022-10-10T12:15:55.733Z","logger":"controller-runtime.metrics","msg":"Metrics server is starting to listen","addr":":8080"}
{"level":"info","ts":"2022-10-10T12:15:55.733Z","logger":"controller-runtime.webhook","msg":"Registering webhook","path":"/validate"}
{"level":"info","ts":"2022-10-10T12:15:55.734Z","logger":"setup","msg":"starting manager"}
{"level":"info","ts":"2022-10-10T12:15:55.734Z","logger":"controller-runtime.webhook.webhooks","msg":"Starting webhook server"}
{"level":"info","ts":"2022-10-10T12:15:55.734Z","msg":"Starting server","path":"/metrics","kind":"metrics","addr":"[::]:8080"}
{"level":"info","ts":"2022-10-10T12:15:55.734Z","logger":"controller-runtime.certwatcher","msg":"Updated current TLS certificate"}
{"level":"info","ts":"2022-10-10T12:15:55.734Z","logger":"controller-runtime.certwatcher","msg":"Starting certificate watcher"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","msg":"Stopping and waiting for non leader election runnables"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","msg":"Stopping and waiting for leader election runnables"}
I1010 12:15:55.835096       1 leaderelection.go:248] attempting to acquire leader lease kafka-system/controller-leader-election-helper...
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaTopic","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaTopic","source":"kind source: *v1alpha1.KafkaTopic"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaTopic","msg":"Starting Controller","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaTopic"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaUser","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaUser","source":"kind source: *v1alpha1.KafkaUser"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaUser","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaUser","source":"kind source: *v1.CertificateSigningRequest"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaUser","msg":"Starting Controller","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaUser"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: *v1beta1.KafkaCluster"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: *v1.Service"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: *v1.ConfigMap"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.CruiseControl","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: *v1beta1.KafkaCluster"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: *v1beta1.PodDisruptionBudget"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.CruiseControl","msg":"Starting Controller","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: *v1.PersistentVolumeClaim"}
{"level":"error","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaTopic","msg":"Could not wait for Cache to sync","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaTopic","error":"failed to wait for KafkaTopic caches to sync: timed out waiting for cache to be synced","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:208\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:234\nsigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/manager/runnable_group.go:218"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: *v1.Pod"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: *v1.Service"}
{"level":"error","ts":"2022-10-10T12:15:55.835Z","logger":"controller.CruiseControl","msg":"Could not wait for Cache to sync","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","error":"failed to wait for CruiseControl caches to sync: timed out waiting for cache to be synced","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:208\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:234\nsigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/manager/runnable_group.go:218"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: *v1.Deployment"}
{"level":"error","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaUser","msg":"Could not wait for Cache to sync","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaUser","error":"failed to wait for KafkaUser caches to sync: timed out waiting for cache to be synced","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:208\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:234\nsigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/manager/runnable_group.go:218"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: *v1.ConfigMap"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: *v1.Service"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: *v1.Deployment"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: *v1.ConfigMap"}
{"level":"info","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaCluster","msg":"Starting Controller","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster"}
{"level":"error","ts":"2022-10-10T12:15:55.835Z","logger":"controller.KafkaCluster","msg":"Could not wait for Cache to sync","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","error":"failed to wait for KafkaCluster caches to sync: timed out waiting for cache to be synced","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:208\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:234\nsigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/manager/runnable_group.go:218"}
{"level":"error","ts":"2022-10-10T12:15:55.835Z","msg":"error received after stop sequence was engaged","error":"failed to wait for KafkaTopic caches to sync: timed out waiting for cache to be synced","stacktrace":"sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).engageStopProcedure.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/manager/internal.go:541"}
{"level":"error","ts":"2022-10-10T12:15:55.835Z","msg":"error received after stop sequence was engaged","error":"failed to wait for CruiseControl caches to sync: timed out waiting for cache to be synced","stacktrace":"sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).engageStopProcedure.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/manager/internal.go:541"}
{"level":"error","ts":"2022-10-10T12:15:55.835Z","msg":"error received after stop sequence was engaged","error":"failed to wait for KafkaUser caches to sync: timed out waiting for cache to be synced","stacktrace":"sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).engageStopProcedure.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/manager/internal.go:541"}
{"level":"error","ts":"2022-10-10T12:15:55.835Z","msg":"error received after stop sequence was engaged","error":"failed to wait for KafkaCluster caches to sync: timed out waiting for cache to be synced","stacktrace":"sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).engageStopProcedure.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/manager/internal.go:541"}
{"level":"info","ts":"2022-10-10T12:15:55.837Z","logger":"KubeAPIWarningLogger","msg":"policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget"}
I1010 12:16:11.603731       1 leaderelection.go:258] successfully acquired lease kafka-system/controller-leader-election-helper
{"level":"info","ts":"2022-10-10T12:16:25.835Z","msg":"Stopping and waiting for caches"}
{"level":"error","ts":"2022-10-10T12:16:25.835Z","logger":"setup","msg":"problem running manager","error":"[listen tcp :443: bind: permission denied, failed waiting for all runnables to end within grace period of 30s: context deadline exceeded]","errorCauses":[{"error":"listen tcp :443: bind: permission denied"},{"error":"failed waiting for all runnables to end within grace period of 30s: context deadline exceeded"}],"stacktrace":"main.main\n\t/workspace/main.go:207\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:255"}
{"level":"info","ts":"2022-10-10T12:16:25.835Z","msg":"Stopping and waiting for webhooks"}

Hello @morningspace,

Thank you for sharing your issue.
Unfortunately, as of now Koperator does not support OpenShift clusters and functionality is not guaranteed to work there, so investigating this issue might take significant time.

We have this feature on our roadmap and are going to investigate the possibilities further in the near future (the coming couple of months), but currently that is all we can provide.

Thank you for your patience and understanding.