Deadlock in connection pool
Gilthoniel opened this issue · 2 comments
Expected behavior
The client should properly handle a cluster that is not ready yet and retry until it gets healthy again.
Actual behavior
When the client attempts to reconnect or to create new producers, consumers or readers while the service is not ready, it can block the connection pool.
Steps to reproduce
ConnectionClosed callbacks can block the connection pool because the pool's GetConnection may close a connection when its state has changed, which happens when the cluster is not ready. In our case, we observed many closures: right after we got a connection to the broker, it was closed due to ServiceNotReady because too many bookies were down.
System configuration
Pulsar version: 3.0.5
Pulsar client: 13.1
Could we reproduce this issue?
Hardly in a simple integration test as it requires a healthy client against a failing cluster.
I can give more details, however, as it happened again recently. It deadlocked again because we received plenty of "Broker notification of Closed producer: 7" notifications, each of which calls ConnectionClosed, filling up the channel.
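
For illustration, the hazard described above can be sketched roughly as follows. This is a minimal sketch, not the client's actual code: it assumes a bounded channel of close notifications and a lock that the consumer of that channel also needs. The names pool, events, notifyBlocking and notifyNonBlocking are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// pool is a hypothetical stand-in for a connection pool.
// It is NOT the real type used by the Pulsar client.
type pool struct {
	mu     sync.Mutex
	events chan int // bounded queue of "connection closed" notifications
}

// notifyBlocking mimics a close callback that sends while holding the pool lock.
// If events is full and its consumer needs p.mu to make progress, the sender
// waits for the consumer and the consumer waits for the lock: a deadlock.
// It is shown for comparison only and is never called in this sketch.
func (p *pool) notifyBlocking(id int) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.events <- id // blocks forever once the queue is full
}

// notifyNonBlocking reports false instead of waiting when the queue is full,
// so the lock is always released and the caller can decide what to do.
func (p *pool) notifyNonBlocking(id int) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	select {
	case p.events <- id:
		return true
	default:
		return false
	}
}

func main() {
	// Capacity 2 is an arbitrary value chosen to make the queue fill quickly.
	p := &pool{events: make(chan int, 2)}
	for id := 1; id <= 3; id++ {
		fmt.Printf("notification %d queued: %v\n", id, p.notifyNonBlocking(id))
	}
}
```

With a capacity of 2 and nothing draining the channel, the first two notifications are queued and the third is rejected. In the blocking variant, that third send is the point where the callback would hang while still holding the lock. Whether dropping, coalescing, or sending outside the lock is the right fix depends on how the real pool consumes these events.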