[ZK] Triggering validation plan returns an error for zookeeper operator
rishabh96b opened this issue · 0 comments
Description
The validation plan of the zookeeper operator does not run properly, yet it is marked as COMPLETE. Please find the detailed logs below.
└── zookeeper-instance (Operator-Version: "zookeeper-3.4.14-0.3.1" Active-Plan: "validation")
    ├── Plan deploy (serial strategy) [NOT ACTIVE]
    │   ├── Phase zookeeper (parallel strategy) [NOT ACTIVE]
    │   │   └── Step deploy [NOT ACTIVE]
    │   └── Phase validation (serial strategy) [NOT ACTIVE]
    │       ├── Step validation [NOT ACTIVE]
    │       └── Step cleanup [NOT ACTIVE]
    ├── Plan not-allowed (serial strategy) [NOT ACTIVE]
    │   └── Phase not-allowed (serial strategy) [NOT ACTIVE]
    │       └── Step not-allowed [NOT ACTIVE]
    └── Plan validation (serial strategy) [COMPLETE], last updated 2021-01-04 20:10:40
        └── Phase connection (serial strategy) [COMPLETE]
            ├── Step connection [COMPLETE]
            └── Step cleanup [COMPLETE]
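For anyone trying to reproduce this, a rough sketch of how the per-plan status could be pulled out of the tree above (presumably the output of the KUDO plan status command, which is an assumption on my part). The helper name `plan_statuses` and the regular expression are hypothetical and only match the line shape shown in this issue.

```python
import re
from typing import Dict

# Hypothetical pattern: matches lines such as
# "Plan validation (serial strategy) [COMPLETE], last updated ..."
# It deliberately does not match "Phase ..." or "Step ..." lines.
PLAN_RE = re.compile(
    r"Plan (?P<name>\S+) \((?P<strategy>\w+) strategy\) \[(?P<status>[A-Z_ ]+)\]"
)


def plan_statuses(status_output: str) -> Dict[str, str]:
    """Map each plan name to the status reported in the tree output."""
    return {m["name"]: m["status"] for m in PLAN_RE.finditer(status_output)}


# Lines copied from the output in this issue (phases and steps trimmed).
sample = """
└── zookeeper-instance (Operator-Version: "zookeeper-3.4.14-0.3.1" Active-Plan: "validation")
    ├── Plan deploy (serial strategy) [NOT ACTIVE]
    ├── Plan not-allowed (serial strategy) [NOT ACTIVE]
    └── Plan validation (serial strategy) [COMPLETE], last updated 2021-01-04 20:10:40
"""

print(plan_statuses(sample))
# {'deploy': 'NOT ACTIVE', 'not-allowed': 'NOT ACTIVE', 'validation': 'COMPLETE'}
```

This makes it easy to see that only the validation plan reports COMPLETE while the deploy plan is NOT ACTIVE.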
Command
kubectl kudo plan trigger --name=validation --instance=zookeeper-instance
The kudo-controller logs are flooded with:
2021/01/04 14:20:10 HealthUtil: unknown type *v1beta1.PodDisruptionBudget is marked healthy by default
2021/01/04 14:20:10 HealthUtil: statefulset "zookeeper-instance-zookeeper" is not healthy: Waiting for 1 pods to be ready...
2021/01/04 14:20:10 TaskExecution: object default/zookeeper-instance-zookeeper is NOT healthy: statefulset "zookeeper-instance-zookeeper" is not healthy: Waiting for 1 pods to be ready...
2021/01/04 14:20:10 PlanExecution: 'deploy' step(s) (instance: default/zookeeper-instance) of the deploy.zookeeper are not ready
2021/01/04 14:20:10 InstanceController: Received Reconcile request for instance default/zookeeper-instance
The plan is supposed to trigger a job that in turn prints the zookeeper URI, but it is unable to create any job, and the logs state:
HealthUtil: job "zookeeper-instance-validation" still running or failed
2021/01/04 14:20:28 TaskExecution: object default/zookeeper-instance-validation is NOT healthy: job "zookeeper-instance-validation" still running or failed
2021/01/04 14:20:28 PlanExecution: 'validation' task(s) (instance: default/zookeeper-instance) of the deploy.validation.validation are not ready
2021/01/04 14:20:28 PlanExecution: 'validation,cleanup' step(s) (instance: default/zookeeper-instance) of the deploy.validation are not ready
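Since the controller log is very noisy, here is a minimal sketch for condensing it into the objects that TaskExecution reports as NOT healthy. The function name `unhealthy_objects` is hypothetical, and the pattern assumes the exact message wording seen in the lines above.

```python
import re
from typing import Dict, Iterable

# Hypothetical pattern based on the TaskExecution lines quoted in this issue.
TASK_RE = re.compile(
    r"TaskExecution: object (?P<obj>\S+) is NOT healthy: (?P<reason>.*)"
)


def unhealthy_objects(lines: Iterable[str]) -> Dict[str, str]:
    """Return the most recent 'NOT healthy' reason seen for each object."""
    latest: Dict[str, str] = {}
    for line in lines:
        match = TASK_RE.search(line)
        if match:
            latest[match["obj"]] = match["reason"]
    return latest


# Two lines copied from the controller log above.
sample_log = [
    '2021/01/04 14:20:10 TaskExecution: object default/zookeeper-instance-zookeeper '
    'is NOT healthy: statefulset "zookeeper-instance-zookeeper" is not healthy: '
    'Waiting for 1 pods to be ready...',
    '2021/01/04 14:20:28 TaskExecution: object default/zookeeper-instance-validation '
    'is NOT healthy: job "zookeeper-instance-validation" still running or failed',
]

for obj, reason in unhealthy_objects(sample_log).items():
    print(obj, "->", reason)
```

On the lines above this reports two objects: the StatefulSet from the deploy plan and the validation job.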
The zookeeper-instance StatefulSet looks to be okay:
2021-01-04 14:24:16,272 [myid:3] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@222] - Accepted socket connection from /127.0.0.1:39720
2021-01-04 14:24:16,272 [myid:3] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@908] - Processing ruok command from /127.0.0.1:39720
2021-01-04 14:24:16,273 [myid:3] - INFO [Thread-290:NIOServerCnxn@1056] - Closed socket connection for client /127.0.0.1:39720 (no session established for client)
Lastly, I am also getting TLS handshake errors in the controller logs:
2021/01/04 14:20:31 InstanceController: Error when updating instance status. Operation cannot be fulfilled on instances.kudo.dev "zookeeper-instance": the object has been modified; please apply your changes to the latest version and try again
2021/01/04 14:20:32 InstanceController: Received Reconcile request for instance default/zookeeper-instance
2021/01/04 14:20:32 Computing health out of 0 Deployments, 0 ReplicaSets, 1 StatefulSets, 0 DaemonSets, 3 Pods
2021/01/04 14:20:32 Updating instance default/zookeeper-instance readiness to: true
2021/01/04 14:20:32 InstanceController: Readiness did not change for default/zookeeper-instance. Not updating.
2021/01/04 14:20:32 http: TLS handshake error from 10.0.130.81:56732: EOF
2021/01/04 14:20:42 http: TLS handshake error from 10.0.130.81:56844: EOF
...
KUDO Version
KUDO Version: version.Info{GitVersion:"0.17.2", GitCommit:"d902714c", BuildDate:"2020-11-16T20:34:11Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64", KubernetesClientVersion:"v0.19.2"}
I also tried this with KUDO version 0.17.0 and got the same error.