uber/tchannel

Adding new Hyperbahn hosts may degrade performance until clients receive the new bootstrap file

prashantv opened this issue · 5 comments

Currently, an incoming connection does not identify itself as any service, and so it is not automatically added to any peer lists.

If new Hyperbahn hosts are added, then a service's affinity may change to include some of the new hosts. However, incoming connections from these hosts would not be used, since the hosts would not be part of the hosts file yet (or the client has not refreshed the hosts file). This would degrade service quality, since the new nodes reduce the set of usable affinity nodes.
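To make the degradation concrete, here is a minimal sketch of the arithmetic. It assumes the client only uses affinity nodes that appear in its hosts file; the host names, port, and the `usable_affinity` helper are made up for illustration and are not part of any tchannel client.

```python
def usable_affinity(affinity, known_hosts):
    """Return the affinity nodes the client would actually use,
    i.e. those already present in its (possibly stale) hosts file."""
    return [host for host in affinity if host in known_hosts]


# Hypothetical hosts file the client loaded before the new nodes were added.
known_hosts = {"hb1:21300", "hb2:21300", "hb3:21300"}

# Affinity before and after two new Hyperbahn hosts (hb4, hb5) join.
old_affinity = ["hb1:21300", "hb2:21300", "hb3:21300"]
new_affinity = ["hb1:21300", "hb4:21300", "hb5:21300"]

usable_before = usable_affinity(old_affinity, known_hosts)
usable_after = usable_affinity(new_affinity, known_hosts)
```

In this example the client goes from three usable affinity nodes to one until it picks up a hosts file that lists hb4 and hb5.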

One solution is for incoming connections to add a service name (or service names) to the init headers.

If the incoming connection was from Hyperbahn, we could add this connection to the Hyperbahn peer list. The client implementations can share the peer list for Hyperbahn with subchannels for other services.
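A rough sketch of what that could look like on the receiving side. The `service_names` init header is an assumption (it is the proposal here, not an existing header), and `PeerLists` is a hypothetical stand-in for whatever structure a client implementation uses to track peers per service.

```python
from collections import defaultdict


class PeerLists:
    """Hypothetical per-service peer lists shared by all subchannels."""

    def __init__(self):
        self._peers = defaultdict(list)

    def on_incoming_init(self, conn, init_headers):
        """Add an incoming connection to the peer list of every service
        it names in the (assumed) comma-separated 'service_names' header."""
        raw = init_headers.get("service_names", "")
        names = [n.strip() for n in raw.split(",") if n.strip()]
        for name in names:
            if conn not in self._peers[name]:
                self._peers[name].append(conn)
        return names

    def peers_for(self, service):
        """Connections currently usable for calls to the given service."""
        return list(self._peers.get(service, []))


lists = PeerLists()
# An incoming connection that identifies itself as Hyperbahn.
lists.on_incoming_init("conn-1", {"service_names": "hyperbahn"})
# An incoming connection with no service name is not added anywhere,
# which matches the current behaviour described above.
lists.on_incoming_init("conn-2", {})
```

Because the lists are keyed by service name, any subchannel that routes through Hyperbahn could look up `peers_for("hyperbahn")` and reuse the incoming connection.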

Solving this at the tchannel protocol init level seems a bit overly general to me. I'd rather add a hyperbahn-level rpc that:

  • would be for arbitrary "please add this list of peers to your list of hyperbahn peers" requests
  • gets sent on new connection initiation
  • can get sent at any point in the future

We can then tune how much of the total host list we tell each client about, and how often.
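As a sketch of the rpc idea, assuming the client keeps its Hyperbahn peers in a set: `handle_add_peers` is what the client-side handler might do with the request body, and `pick_peers_to_advertise` is one possible way for a Hyperbahn node to limit how much of the host list it sends. Both names, and the random-sample policy, are hypothetical.

```python
import random


def handle_add_peers(hyperbahn_peers, new_peers):
    """Client side: merge advertised peers into the local Hyperbahn
    peer set and return how many of them were previously unknown."""
    before = len(hyperbahn_peers)
    hyperbahn_peers.update(new_peers)
    return len(hyperbahn_peers) - before


def pick_peers_to_advertise(all_hosts, limit, rng):
    """Hyperbahn side: choose at most `limit` hosts to tell a client about.
    A uniform random sample is only one possible tuning policy."""
    hosts = sorted(all_hosts)
    if limit >= len(hosts):
        return hosts
    return rng.sample(hosts, limit)


all_hosts = {f"hb{i}:21300" for i in range(1, 7)}
client_peers = {"hb1:21300", "hb2:21300"}

# Sent on new connection initiation, and possibly again later.
advertised = pick_peers_to_advertise(all_hosts, 4, random.Random(0))
added = handle_add_peers(client_peers, advertised)
```

The `limit` argument and the frequency of the call are the two knobs mentioned above: how much of the total host list each client hears about, and how often.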

The Hyperbahn client would then no longer be a client since it would need to register a server -- which isn't terrible but I think it over-complicates the clients. We would also need to think about:

  • Do we always send a list of all Hyperbahn nodes?
  • Is this duplicated k times by every affinity node?

The general solution seems cleaner, since it also works in a world where services could just connect to each other, and they would be able to send calls back on those connections without needing any extra configuration. It's also something that is a much smaller change (e.g. add a header).

You just need to register a handler; you don't need a listening socket.
Sockets are bidirectional by design.
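A small illustration of that point using plain sockets rather than tchannel: the side that dialed out never listens, yet it can still answer a request arriving on the connection it opened. The line-based wire format and the `add_peers` endpoint name are invented for this sketch.

```python
import socket

handlers = {}
known_peers = set()


def register(endpoint, fn):
    """Register a handler; no listening socket is involved."""
    handlers[endpoint] = fn


def add_peers(arg):
    """Hypothetical handler: record the comma-separated peers."""
    known_peers.update(arg.split(","))
    return "ok"


def serve_one(sock):
    """Read one 'endpoint arg' line from the socket, dispatch it,
    and write the handler's reply back on the same socket."""
    line = sock.recv(1024).decode().strip()
    endpoint, _, arg = line.partition(" ")
    sock.sendall((handlers[endpoint](arg) + "\n").encode())


# socketpair() stands in for an established connection:
# `dialer` is the client that connected out, `acceptor` is Hyperbahn.
dialer, acceptor = socket.socketpair()
register("add_peers", add_peers)

# Hyperbahn sends a request back over the connection the client opened.
acceptor.sendall(b"add_peers hb4:21300,hb5:21300\n")
serve_one(dialer)
reply = acceptor.recv(1024).decode().strip()

dialer.close()
acceptor.close()
```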

Doing this outside of the init req/res handshake is important for services that only deploy weekly.