whatyouhide/xandra

Initial connection pool inconsistent with parameters


We noticed the following problem.

Let's say we have local nodes L1, L2, L3 and remote nodes R1, R2, R3.

When we start a connection with:

Xandra.Cluster.start_link(
  nodes: ["L1", "L2", "L3"],
  target_pools: 3,
  load_balancing: {Xandra.Cluster.LoadBalancingPolicy.DCAwareRoundRobin, []}
)

the control connection connects to L1 and runs "SELECT * FROM system.peers". Connection pools are then opened in the order the query returns the hosts, up to target_pools (3 in our case), because the control connection sends send(data.cluster, {:host_up, host}) for each host in that order. Let's say "SELECT * FROM system.peers" returns:

  1. R1
  2. L2
  3. L3
  4. R2
  5. R3

the pools we have open will be to L1, R1, and L2. Now, when we execute a query, since we use DCAwareRoundRobin, only L1 and L2 get used. We could work around the problem with target_pools: 6, but that would leave 3 open pools that we never use (unless the local DC goes down).
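
To make this concrete, here is a tiny sketch in plain Elixir of how the first target_pools hosts end up pooled regardless of DC (hosts as hypothetical {name, dc} tuples, not Xandra's Host.t() structs):

peers_in_returned_order = [
  {"R1", "remote"}, {"L2", "local"}, {"L3", "local"},
  {"R2", "remote"}, {"R3", "remote"}
]

# The control connection is already connected to L1; each {:host_up, host}
# message then opens a pool, in the order system.peers returned the hosts,
# until target_pools (3) pools are open.
pools = Enum.take([{"L1", "local"} | peers_in_returned_order], 3)
# => [{"L1", "local"}, {"R1", "remote"}, {"L2", "local"}]

# DCAwareRoundRobin only routes queries to local pools, so R1's pool is wasted:
Enum.filter(pools, fn {_name, dc} -> dc == "local" end)
# => [{"L1", "local"}, {"L2", "local"}]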

Maybe we could introduce logic in Xandra.Cluster that prioritizes the initial :nodes. One idea: when a :host_up host is in the initial :nodes, find the first pooled host that isn't in :nodes, shut its pool down, and open a connection to the new host instead.
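
A rough, hypothetical sketch of that swap logic (every name here is made up for illustration; pools is a host-name => pool map and hosts are plain strings rather than Host.t() structs):

defmodule InitialNodesPrioritySketch do
  # Decide what to do when a :host_up arrives and we already track `pools`.
  def on_host_up(pools, initial_nodes, new_host, target_pools) do
    cond do
      # Still below the target: just open a pool to the new host.
      map_size(pools) < target_pools ->
        {:open, new_host}

      # The new host is a configured contact point: evict the first pooled
      # host that isn't in :nodes and connect to the new host instead.
      new_host in initial_nodes ->
        case Enum.find(Map.keys(pools), &(&1 not in initial_nodes)) do
          nil -> :keep
          victim -> {:swap, victim, new_host}
        end

      # Otherwise, keep the pools we already have.
      true ->
        :keep
    end
  end
end

For the scenario above, once pools to L1, R1, and L2 are open and L3 comes up:

InitialNodesPrioritySketch.on_host_up(
  %{"L1" => :pool1, "R1" => :pool2, "L2" => :pool3},
  ["L1", "L2", "L3"],
  "L3",
  3
)
# => {:swap, "R1", "L3"}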

@harunzengin good catch.

> Maybe we could introduce logic in Xandra.Cluster that prioritizes the initial :nodes

I don't want to do that because the initial nodes are mostly a way to set contact points, but we don't always know how to match an initial node to an entry in system.peers. For example, you might be passing your L2 node as cassandra2.example.net, but in system.peers that's just an IP. Does that make sense?

I believe the solution to this is to introduce another message that the control connection sends to the cluster: {:discovered_nodes, [Host.t()]}. I had this initially, but I removed it in favor of just :host_up. It would work like this:

sequenceDiagram
  participant c as Cluster
  participant cc as Control connection
  participant cass as Cassandra

  c->>c: Initiate load-balancing policy  
  cc->>+cass: SELECT * FROM system.peers
  cass->>-cc: [Host.t()]
  cc->>cc: Figure out diff with current nodes, and which are new nodes
  cc->>c: {:discovered_nodes, [Host.t()]}
  loop Every new host
    c->>c: lbp.host_up(host)
  end 

This way, the initial list of nodes is transmitted to the cluster in a single message, and the cluster adds all of them to the load-balancing policy before starting any pools. The DC-aware LBP will then be able to prioritize the local nodes.
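
For illustration, a minimal runnable sketch of what the cluster could do once it gets all discovered hosts in a single {:discovered_nodes, ...} message (hypothetical {name, dc} tuples again, with a hard-coded local-first ranking standing in for the real LBP):

defmodule DiscoveredNodesSketch do
  # Rank all discovered hosts local-first *before* opening any pool, then
  # take only as many as target_pools. Enum.sort_by/2 is stable, so the
  # original order is preserved within each DC.
  def pools_to_open(hosts, target_pools, local_dc) do
    hosts
    |> Enum.sort_by(fn {_name, dc} -> if dc == local_dc, do: 0, else: 1 end)
    |> Enum.take(target_pools)
  end
end

hosts = [
  {"L1", "local"}, {"R1", "remote"}, {"L2", "local"},
  {"L3", "local"}, {"R2", "remote"}, {"R3", "remote"}
]

DiscoveredNodesSketch.pools_to_open(hosts, 3, "local")
# => [{"L1", "local"}, {"L2", "local"}, {"L3", "local"}]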

What do you think?

@whatyouhide yep, this makes sense. And about the comment in #288 (comment): maybe we could build in a mechanism to query that server before marking it as up? Does that make sense?

Let's start with the solution in my comment above, since these are two different problems.