banzaicloud/koperator

Not able to deploy cluster from simplekafkacluster.yaml

AlexABorisov opened this issue · 3 comments

Hi. When deploying simplekafkacluster.yaml I get the following error:

Error: Internal error occurred: failed calling webhook "kafkaclusters.kafka.banzaicloud.io": failed to call webhook: Post "https://oce-system-kafka-operator-operator.oce-system.svc:443/validate-kafka-banzaicloud-io-v1beta1-kafkacluster?timeout=10s": context deadline exceeded

Calling the webhook with curl shows no issues: it returns 200 OK.

Koperator version: v0.24.1
Kubernetes version:

Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.8-gke.1000", GitCommit:"4ac1389d4c3eefaabbd9fa31782fcbfd72e6e6e6", GitTreeState:"clean", BuildDate:"2023-03-30T14:01:58Z", GoVersion:"go1.19.7 X:boringcrypto", Compiler:"gc", Platform:"linux/amd64"}

Hi AlexABorisov,

Looks like there is a problem accessing the Kubernetes API server when sending the object and the HTTP request times out before the validation webhook would process the request.
I suspect it is most likely a network issue inside the GKE cluster itself, and it would need to be resolved there.
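The URL in the error message already shows which in-cluster endpoint the call was aimed at. As a rough sketch (Python standard library only; the variable names are just for illustration), it can be split into the Service name, namespace, port, path and timeout to check them against the deployed Service and the ValidatingWebhookConfiguration:

```python
from urllib.parse import urlsplit, parse_qs

# URL copied verbatim from the error message reported above.
url = (
    "https://oce-system-kafka-operator-operator.oce-system.svc:443"
    "/validate-kafka-banzaicloud-io-v1beta1-kafkacluster?timeout=10s"
)

parts = urlsplit(url)

# In-cluster Service hostnames have the form <service>.<namespace>.svc
service, namespace, _ = parts.hostname.split(".", 2)

# The timeout the caller applied to the webhook request.
timeout = parse_qs(parts.query)["timeout"][0]

print("service:  ", service)
print("namespace:", namespace)
print("port:     ", parts.port)
print("path:     ", parts.path)
print("timeout:  ", timeout)
```

If the Service name, namespace, port or path differ from what is actually deployed, the request would never reach the operator pod, regardless of how healthy the pod itself is.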

Could you check whether the node Koperator is running on is a preemptible one and, if so, whether other services running on the same node are reachable without issues?

Hello pregno
No issues with node accessibility or network. Other pods are accessible.
I suspect it may be an issue with the node or the API server. The node is a bit oversubscribed:

Waited for 1.038034464s due to client-side throttling, not priority and fairness, request: GET:https://10.207.3.1:443/apis/nodemanagement.gke.io/v1alpha1?timeout=32s

But I think this may not be related.

Anyway, I tested from different Kubernetes nodes: no network issue, connectivity works fine.
The operator responds properly with 200 OK and

{"response":{"uid":"","allowed":false,"status":{"metadata":{},"message":"contentType=, expected application/json","code":400}}}
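Note that this response says the request was rejected with code 400 because no `Content-Type: application/json` header was sent, so it confirms the webhook is reachable from the node but not that a real admission request would pass. A minimal sketch of a closer check, assuming only the Python standard library: `build_admission_review` and `interpret_response` are hypothetical helper names, and the request body is a bare-bones AdmissionReview with an empty object, not what the API server actually sends. The group, version and kind are taken from the webhook name and path in the error message.

```python
import json
import uuid


def build_admission_review(group="kafka.banzaicloud.io",
                           version="v1beta1",
                           kind="KafkaCluster"):
    """Return (headers, body) for a minimal AdmissionReview v1 request."""
    review = {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "request": {
            "uid": str(uuid.uuid4()),
            "kind": {"group": group, "version": version, "kind": kind},
            "operation": "CREATE",
            "object": {},  # placeholder; a real request carries the full resource
        },
    }
    # The header whose absence produced the 400 quoted above.
    headers = {"Content-Type": "application/json"}
    return headers, json.dumps(review).encode("utf-8")


def interpret_response(raw):
    """Extract (allowed, code, message) from an AdmissionReview response."""
    response = json.loads(raw)["response"]
    status = response.get("status", {})
    return response.get("allowed", False), status.get("code"), status.get("message", "")


# The response body quoted in this comment.
observed = (
    '{"response":{"uid":"","allowed":false,"status":{"metadata":{},'
    '"message":"contentType=, expected application/json","code":400}}}'
)

allowed, code, message = interpret_response(observed)
print(allowed, code, message)
```

The headers and body from `build_admission_review` can be passed to any HTTP client (for example curl with `-H` and `--data`) to repeat the test with a request closer in shape to what the API server sends.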

It looks like the operator works. The issue may be related to the ValidatingWebhookConfiguration or to where it is executed.

Not a Koperator issue.