Make use of the address info in MOVED error
Closed this issue · 2 comments
During slot migration in a large cluster (or due to a failover), when the slot ownership changes, not every node in the cluster will learn about this change instantly - especially when the cluster has 10s-100s of nodes or when the other nodes are busy running commands. So, when we get a MOVED error, it is better to
- Send a topology refresh request to the node that responded with the MOVED error as it has learnt about the topology change.
- Resend the request to the address in the MOVED response.
Motivation: This can prevent repeated MOVED errors during slot migration or failover.
I see that other clients make use of the address in the MOVED response
- Go redis https://github.com/redis/go-redis/blob/8afc2b9314ea9f719219937f52f05484b4cc0ae2/osscluster.go#L966
- Jedis https://github.com/redis/jedis/blob/64b5aac03bc26a1278796718a49c264d0005164f/src/main/java/redis/clients/jedis/executors/ClusterCommandExecutor.java#L115-L118
- Lettuce https://github.com/lettuce-io/lettuce-core/blob/761d60247159662ce1db8bc20318bc4b573bc010/src/main/java/io/lettuce/core/cluster/ClusterDistributionChannelWriter.java#L136C21-L136C27
But radix discards the address in the MOVED error and refreshes topology by contacting a random node
Line 733 in 3d8bf81
Is this something that can be improved?
Hi @srgsanky , thanks for submitting the issue. I do think this is something which is worth improving, but I'm not sure if I will have the time to do it myself. If you wanted to submit a PR I'd be happy to review and merge it. It should be pretty simple change, basically just taking what Sync()
does internally and inlining it into doInner
, with the call to pool()
accepting an address instead of ""
.