Rust core can't connect to a replica
Closed this issue · 4 comments
Describe the bug
python3 utils/cluster_manager.py start -p 6380 6381
starts 2 nodes (not a cluster!).
The servers assign the master/replica roles automatically (they communicate with each other).
Then, when a connection request is sent through UDS with only one server address and that server is a replica, the Rust core lib fails to connect.
Expected Behavior
Connection should succeed
Current Behavior
2024-01-22T23:37:26.982230Z DEBUG logger_core: connection - new socket listener initiated
2024-01-22T23:37:27.527979Z INFO logger_core: Connection configuration -
Addresses: localhost:6380
TLS mode: No TLS
Standalone mode
Read from Replica mode: Only primary
Protocol: RESP3
2024-01-22T23:37:27.536844Z DEBUG logger_core: connection creation - Attempting connection to host: "localhost" port: 6380
2024-01-22T23:37:27.540984Z DEBUG logger_core: connection creation - Connection to localhost:6380 created
2024-01-22T23:37:27.541108Z ERROR logger_core: ClientCreationError - ConnectionError - ConnectionError(Standalone(Received errors:
))
2024-01-22T23:37:27.541126Z ERROR logger_core: client creation - Connection error: Standalone(Received errors:
)
Reproduction Steps
python3 utils/cluster_manager.py start -p 6380 6381
Then
var regularClient =
    RedisClient.CreateClient(
            RedisClientConfiguration.builder()
                .address(NodeAddress.builder().port(6380).build())
                .build())
        .get(10, TimeUnit.SECONDS);
Note: this is flaky, because master/replica election is a random process. On one run port 6380 may be taken by the master; on another run it may be taken by the replica.
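Since the role behind port 6380 changes between runs, it can help to check the role of each node before running the reproduction. The following is a minimal sketch, not part of the client or of `cluster_manager.py`: it assumes you have already captured the text of `INFO replication` for a node (for example with `redis-cli -p 6380 info replication`), and the helper name `parse_replication_role` is hypothetical.

```python
def parse_replication_role(info_text: str) -> str:
    """Return 'primary' or 'replica' from the text of `INFO replication`.

    Hypothetical helper for illustration; it only parses text and does not
    open a connection to the server.
    """
    for line in info_text.splitlines():
        line = line.strip()
        if line.startswith("role:"):
            role = line.split(":", 1)[1]
            # Redis reports the role as "master" or "slave" in this section.
            if role == "master":
                return "primary"
            if role == "slave":
                return "replica"
            raise ValueError(f"unexpected role value: {role!r}")
    raise ValueError("no 'role:' field found in INFO replication output")


# Example input shaped like the reply of a replica node.
sample = "# Replication\r\nrole:slave\r\nmaster_host:127.0.0.1\r\nmaster_port:6381\r\n"
print(parse_replication_role(sample))
```

If the node on port 6380 reports `replica`, the connection attempt in the reproduction steps is the one expected to fail.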
See the different responses to the HELLO message in the tcpdump/Wireshark network dump, on failure and on success:
tcpdump.zip
Possible Solution
Workarounds
- After the fix for #848, start the server with only one node and use it
- Pass both ports to the Rust core lib in the connection request
Additional Information/Context
No response
Client version used
N/A
Redis Version
6.0.16
OS
Linux
Language
Python
Language Version
N/A
Cluster information
No response
Logs
No response
Other information
No response
This is intentional behavior - we don't want to leave the client in a state that doesn't allow the user to use some of its functionality.
A user may intentionally want to connect to a replica node to get/update node configuration or stats or whatever. Why not?
> Why not?
> we don't want to leave the client in a state that doesn't allow the user to use some of its functionality.
This is why not. The user might not be aware that they're connecting to a replica. Or, in a more complex scenario, the user may try to connect to several nodes, some connections will fail, and only some replicas will succeed. The user will then have a client that is unable to perform actions, without being aware of it.
That is what the connection response is for. It could contain something more verbose, such as:
- connected to all nodes
- connected to several nodes
- connected to replica only
- etc
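To make the proposal concrete, here is a minimal sketch of what such a verbose connection response might look like. It is not the client's actual API: `ConnectionOutcome`, `classify`, and the `NO_NODES` value (standing in for the "etc" above) are hypothetical names, and the roles are assumed to be already known for each node that connected.

```python
from enum import Enum


class ConnectionOutcome(Enum):
    """Hypothetical outcomes a connection response could report."""
    ALL_NODES = "connected to all nodes"
    SOME_NODES = "connected to several nodes"
    REPLICA_ONLY = "connected to replica only"
    NO_NODES = "connected to no nodes"  # assumption, not listed in the thread


def classify(requested, connected_roles):
    """Classify a connection attempt.

    requested: list of addresses the user asked for.
    connected_roles: dict mapping each address that connected to
        'primary' or 'replica'.
    """
    if not connected_roles:
        return ConnectionOutcome.NO_NODES
    # Checked first so the single-replica case from this issue is reported
    # explicitly, even though every requested node did connect.
    if all(role == "replica" for role in connected_roles.values()):
        return ConnectionOutcome.REPLICA_ONLY
    if len(connected_roles) == len(requested):
        return ConnectionOutcome.ALL_NODES
    return ConnectionOutcome.SOME_NODES


# The scenario from this issue: one address given, and it is a replica.
outcome = classify(["localhost:6380"], {"localhost:6380": "replica"})
print(outcome.value)
```

With a response like this, the caller could decide whether a replica-only client is acceptable instead of the connection being rejected outright.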