pires/kubernetes-elasticsearch-cluster

es-master fails: Liveness probe failed: dial tcp 172.17.0.4:9300: getsockopt: connection refused

Closed · 4 comments

Hi,

I'm having a problem launching the es-master deployment.

It keeps restarting and if I check the pod description the failure appears to be:
Liveness probe failed: dial tcp 172.17.0.4:9300: getsockopt: connection refused

After checking all the issues, open and closed (plus the documentation), I've tried the following:

  1. giving minikube more memory

  2. adding

       - name: "NETWORK_HOST"
         value: "eth0:ipv4"

  3. adding

       - name: "NETWORK_HOST"
         value: "eth0"

None of them worked, and the log doesn't help much.

Additional info about the environment: minikube cluster running on RHEL7

mbert commented

I had the same problem. It seems that on a rather slow cluster the liveness probe fires too early, before the node is listening on the transport port.
Adding an initialDelaySeconds setting under the liveness probe in es-master.yaml and es-client.yaml helped me:

    livenessProbe:
      tcpSocket:
        port: transport
      initialDelaySeconds: 30

@mbert thank you!

Interesting, I had the same problem (trying this on GKE with a cluster of 3 n1-standard-2 nodes).

With initialDelaySeconds set to 30, 2 of 3 masters came up, but the third kept restarting and eventually ended up in CrashLoopBackOff. (I also set NETWORK_HOST.)

Setting the delay to 60 seconds seemed to solve the problem. I think there's some sort of master discovery algorithm that needs a bit of time to bootstrap.
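
For anyone landing here later, the probe section with the longer delay would look roughly like this. It's only a sketch based on the snippet earlier in this thread: periodSeconds and failureThreshold are not from the original manifest, they're the Kubernetes defaults written out so it's clear what the probe does once the delay has passed.

```yaml
# Sketch only: liveness probe for the es-master container with a longer delay.
livenessProbe:
  tcpSocket:
    port: transport          # named transport port (9300 in the error message above)
  initialDelaySeconds: 60    # delay reported to work in this thread
  periodSeconds: 10          # assumption: Kubernetes default, shown for clarity
  failureThreshold: 3        # assumption: Kubernetes default, shown for clarity
```

The right delay probably depends on how slow the cluster is, so treat 60 as a starting point rather than a fixed value.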

@tarr11 thx! It works