Initial connection pool inconsistent with parameters
We noticed the following problem. Let's say we have local nodes `L1`, `L2`, `L3` and remote nodes `R1`, `R2`, `R3`.
When we start a connection with:

```elixir
Xandra.Cluster.start_link(
  nodes: ["L1", "L2", "L3"],
  target_pools: 3,
  load_balancing: {Xandra.Cluster.LoadBalancingPolicy.DCAwareRoundRobin, []}
)
```
the control connection connects to `L1` and runs `SELECT * FROM system.peers`. Up to `target_pools` (3 in our case) connections are then opened in the order returned by that query, since the control connection sends `send(data.cluster, {:host_up, host})` in that order. Let's say `SELECT * FROM system.peers` returns:

```
R1
L2
L3
R2
R3
```
Then the pools we have open will be to `L1`, `R1`, and `L2`. Now, when we execute a query, since we use `DCAwareRoundRobin`, only `L1` and `L2` get used. We could work around the problem with `target_pools: 6`, but that would leave 3 open pools that we never use (unless the local DC goes down).
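To make the mechanics concrete, here's a minimal sketch of the behavior described above (purely illustrative, not the actual `Xandra.Cluster` internals; the state shape and `start_pool/1` are made up): pools are opened in the order the `:host_up` messages arrive, capped at `:target_pools`, with no DC awareness.

```elixir
defmodule ClusterSketch do
  # Hypothetical sketch of the current behavior, not real Xandra code.
  use GenServer

  @impl true
  def init(opts) do
    {:ok, %{pools: %{}, target_pools: Keyword.fetch!(opts, :target_pools)}}
  end

  # Start a pool for each :host_up until the target number of pools is reached.
  @impl true
  def handle_info({:host_up, host}, %{pools: pools, target_pools: target} = state)
      when map_size(pools) < target do
    {:ok, pool} = start_pool(host)
    {:noreply, %{state | pools: Map.put(pools, host, pool)}}
  end

  # Any :host_up past the cap is ignored, even if it's a local-DC host.
  def handle_info({:host_up, _host}, state), do: {:noreply, state}

  # Stand-in for starting an actual Xandra connection pool.
  defp start_pool(_host), do: {:ok, make_ref()}
end
```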
Maybe we could introduce logic in `Xandra.Cluster` that prioritizes the initial `:nodes`. One idea: if a `:host_up` host is in the initial `:nodes`, find the first host in the pools that isn't in `:nodes`, shut it down, and open a connection to the new `:host_up` host (see the sketch below).
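A rough sketch of that eviction idea, assuming a hypothetical state shape with `pools`, `initial_nodes`, and `target_pools`, plus stand-in `start_pool/1` and `stop_pool/1` helpers (this is not actual Xandra code):

```elixir
defmodule PrioritizeInitialNodesSketch do
  # Hypothetical sketch: when a :host_up arrives for one of the initial :nodes
  # and all pool slots are taken, evict the first pooled host that is not in
  # :nodes and open a pool to the new host instead.
  def maybe_prioritize(%{pools: pools, initial_nodes: initial, target_pools: target} = state, host) do
    not_initial? = fn {pooled_host, _pool} -> pooled_host not in initial end

    with true <- host in initial and map_size(pools) >= target,
         {evicted_host, evicted_pool} <- Enum.find(pools, not_initial?) do
      :ok = stop_pool(evicted_pool)
      {:ok, new_pool} = start_pool(host)
      %{state | pools: pools |> Map.delete(evicted_host) |> Map.put(host, new_pool)}
    else
      # Either the host is not an initial node, there is still room for it,
      # or every pooled host is already an initial node: leave the state as is.
      _ -> state
    end
  end

  # Stand-ins for starting/stopping an actual Xandra connection pool.
  defp start_pool(_host), do: {:ok, make_ref()}
  defp stop_pool(_pool), do: :ok
end
```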
@harunzengin good catch.

> Maybe we could introduce logic in `Xandra.Cluster` that prioritizes the initial `:nodes`.

I don't want to do that, because the initial nodes are mostly a way to set contact points, and we don't always know how to match an initial node to an entry in `system.peers`. For example, you might be passing your `L2` node as `cassandra2.example.net`, but in `system.peers` that's just an IP. Does that make sense?
I believe the solution to this is to introduce another message that the control connection sends to the cluster: `{:discovered_nodes, [Host.t()]}`. I had this initially, but I removed it in favor of just `:host_up`. It would work like this:
```mermaid
sequenceDiagram
    participant c as Cluster
    participant cc as Control connection
    participant cass as Cassandra
    c->>c: Initiate load-balancing policy
    cc->>+cass: SELECT * FROM system.peers
    cass->>-cc: [Host.t()]
    cc->>cc: Figure out diff with current nodes, and which are new nodes
    cc->>c: {:discovered_nodes, [Host.t()]}
    loop Every new host
        c->>c: lbp.host_up(host)
    end
```
This way, the initial list of nodes is transmitted to the cluster in a single message, and the cluster adds all of them to the load-balancing policy before starting any pools. The DC-aware LBP will then be able to prioritize the local nodes. A rough sketch of what the cluster side could look like is below.
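This is only a hypothetical sketch, not the real `Xandra.Cluster` implementation; `lbp_host_up/2`, `lbp_hosts/1`, and `start_pool/1` are stand-ins for the actual LBP callbacks and pool supervision:

```elixir
defmodule DiscoveredNodesSketch do
  # Hypothetical sketch: register every discovered host with the load-balancing
  # policy first, then ask the policy which hosts to pool, so a DC-aware policy
  # can put local-DC hosts first.
  use GenServer

  @impl true
  def init(opts) do
    {:ok,
     %{
       lbp: Keyword.fetch!(opts, :lbp),
       pools: %{},
       target_pools: Keyword.fetch!(opts, :target_pools)
     }}
  end

  @impl true
  def handle_info({:discovered_nodes, hosts}, %{lbp: lbp, target_pools: target} = state) do
    # Add every discovered host to the LBP before opening any pool.
    lbp = Enum.reduce(hosts, lbp, &lbp_host_up(&2, &1))

    # Let the (DC-aware) policy decide the order, then open up to target_pools pools.
    pools =
      lbp
      |> lbp_hosts()
      |> Enum.take(target)
      |> Map.new(fn host -> {host, start_pool(host)} end)

    {:noreply, %{state | lbp: lbp, pools: pools}}
  end

  # Stand-ins for the real LBP callbacks and pool startup.
  defp lbp_host_up(lbp, _host), do: lbp
  defp lbp_hosts(_lbp), do: []
  defp start_pool(_host), do: make_ref()
end
```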
What do you think?
@whatyouhide yep, this makes sense. And about the comment in #288 (comment), maybe we can build in a mechanism to query that server before marking it as up? Does that make sense?
Let's start with the solution in my comment above, because they are two different problems.