ucloud/redis-operator

sentinel : Readiness probe failed: PONG

cameronbraid opened this issue · 8 comments

I have a 3-node redis cluster. All 3 redis pods are ready; one sentinel pod exists and it has 'Readiness probe failed: PONG'.

In the sentinel pod:

redis-cli -h $(hostname) -p 26379 info sentinel

# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=10.244.2.28:6379,slaves=1,sentinels=1

slaves=1

The readiness check fails if slaves <= 1
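For context, the check amounts to something like the sketch below (my rough reconstruction of a probe that parses the slaves= count out of info sentinel; the operator's actual script may differ):

readinessProbe:
  exec:
    command:
      - sh
      - -c
      - |
        # pull the slaves= count out of the sentinel's master0 line
        slaves=$(redis-cli -h $(hostname) -p 26379 info sentinel \
          | grep -oE 'slaves=[0-9]+' | head -n 1 | cut -d= -f2)
        # not ready while the sentinel sees one replica or fewer
        [ "${slaves:-0}" -gt 1 ]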

I can't see how slaves can be > 1 until the other sentinel pods are started. Does the statefulset need to be changed to podManagementPolicy: Parallel?
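For reference, the change I'm asking about would just be this field on the sentinel statefulset spec (a hypothetical snippet, not what the operator currently generates):

spec:
  podManagementPolicy: Parallel   # launch all pods at once; the default OrderedReady starts them one at a time
  replicas: 3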

I think I had an incorrect understanding of what slaves here means.

Looking further in the operator logs, I see this:

wait for resetore sentinel slave timeout

{"level":"info","ts":1586073304.488766,"logger":"controller_rediscluster","msg":"wait for resetore sentinel slave timeout","namespace":"harbor","name":"harbor-redis"}
{"level":"error","ts":1586073304.4967463,"logger":"controller_rediscluster","msg":"Reconcile handler","Request.Namespace":"harbor","Request.Name":"harbor-redis","error":"wait for resetore sentinel slave timeout","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/zapr@v0.1.1/zapr.go:128\ngithub.com/ucloud/redis-operator/pkg/controller/rediscluster.(*ReconcileRedisCluster).Reconcile\n\t/src/pkg/controller/rediscluster/controller.go:221\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:256\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191004115801-a2eda9f80ab8/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191004115801-a2eda9f80ab8/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191004115801-a2eda9f80ab8/pkg/util/wait/wait.go:88"}
{"level":"error","ts":1586073304.4967966,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"rediscluster-controller","request":"harbor/harbor-redis","error":"wait for resetore sentinel slave timeout","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/zapr@v0.1.1/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191004115801-a2eda9f80ab8/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191004115801-a2eda9f80ab8/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191004115801-a2eda9f80ab8/pkg/util/wait/wait.go:88"}

I am using istio, so I guess it's istio causing an issue with the operator talking to the redis pods.

I'll dig a little deeper to see if I can get it to work

Some more info:

Without the istio sidecar, I get this info on the master:

# Replication
role:master
connected_slaves:2
slave0:ip=10.244.1.41,port=6379,state=online,offset=1033,lag=1
slave1:ip=10.244.0.44,port=6379,state=online,offset=1170,lag=1

However, when the istio sidecar is enabled, I get 127.0.0.1 as the slave IP (ignore the offsets; these are two different clusters):

# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=6379,state=online,offset=18083,lag=0
slave1:ip=127.0.0.1,port=6379,state=online,offset=17946,lag=1

Is there a way to explicitly tell a redis instance what its IP address is?

Looks like someone else discovered this issue:

istio/istio#16078

And there is an FAQ entry in the istio docs about how to use a specific IP address for a slave:

https://istio.io/faq/applications/#redis

Would this be hard to add to the operator?

In short, it would require setting the pod IP in the redis config via replica-announce-ip.
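Something like the following is what I have in mind (a hypothetical container spec following the istio FAQ above, not the operator's generated manifest): expose the pod IP through the downward API and hand it to redis as replica-announce-ip, so replicas advertise a routable address instead of the sidecar's 127.0.0.1.

containers:
  - name: redis
    image: redis:5.0
    env:
      - name: POD_IP                   # pod IP injected via the downward API
        valueFrom:
          fieldRef:
            fieldPath: status.podIP
    command:
      - sh
      - -c
      # announce the real pod IP rather than the istio-proxied loopback address
      - exec redis-server /redis/redis.conf --replica-announce-ip "$POD_IP"

Pulling the IP from the downward API keeps the manifest generic, since the address isn't known until the pod is scheduled.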

FYI, I have a working prototype:

0ddaee7

I have the same issue.

There is no replica-announce-ip configuration in redis 3.2.2; I hope the operator can support redis 3.2.2+. @cameronbraid