sgroschupf/zkclient

UnknownHostException in reconnect causes livedeath

Closed this issue · 0 comments

I've run into a somewhat nasty issue with zkclient while running Kafka.

During a reconnect, the org.apache.zookeeper.ZooKeeper constructor threw an UnknownHostException (https://issues.apache.org/jira/browse/ZOOKEEPER-1576). This caused Kafka (as it would any ZkClient consumer) to silently lose all ZooKeeper connectivity.

As far as I can see, the zkclient consumer has no recourse in this situation: no state change event is fired after stateChangedEvent(Expired). There does not seem to be a way for the client to tell that something has gone wrong, short of kicking off some sort of timed reconnect check on session expiration.
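For illustration, here is a minimal sketch of what such a timed reconnect check might look like. It uses only the JDK. The class `ReconnectWatchdog` and its methods are hypothetical names, not part of zkclient. The assumption is that the consumer would call `sessionExpired()` and `connected()` from its own state listener when it sees the Expired and SyncConnected states.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: fires a callback if no reconnect follows a session expiration. */
public class ReconnectWatchdog {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "zk-reconnect-watchdog");
                t.setDaemon(true); // do not keep the JVM alive just for the watchdog
                return t;
            });
    private final long timeoutMillis;
    private final Runnable onReconnectFailed;
    private ScheduledFuture<?> pending; // guarded by "this"

    public ReconnectWatchdog(long timeoutMillis, Runnable onReconnectFailed) {
        this.timeoutMillis = timeoutMillis;
        this.onReconnectFailed = onReconnectFailed;
    }

    /** Assumed to be called by the consumer when it observes the Expired state. */
    public synchronized void sessionExpired() {
        cancelPending();
        pending = scheduler.schedule(onReconnectFailed, timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Assumed to be called by the consumer when it observes a connected state again. */
    public synchronized void connected() {
        cancelPending();
    }

    private void cancelPending() {
        if (pending != null) {
            pending.cancel(false); // a callback that has already started is not interrupted
            pending = null;
        }
    }

    public void shutdown() {
        scheduler.shutdownNow();
    }
}
```

With something like this, the callback could log an error, raise an alert, or rebuild the client. The obvious drawback is that it only detects the failure by timeout and never sees the underlying exception.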

Would it make sense to add something to the interface that would report session re-connection errors to the consumer? Is there some way to handle this situation that I am not seeing?
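To make the suggestion concrete, here is a rough sketch of the kind of callback I have in mind. Everything in it is hypothetical: `SessionErrorListener`, `ConnectionFactory`, and `reconnect` are illustrative names, not existing zkclient API. `ConnectionFactory` stands in for the code that constructs a new org.apache.zookeeper.ZooKeeper instance, which can throw an IOException such as UnknownHostException.

```java
import java.io.IOException;

/** Hypothetical sketch: report reconnect failures to the consumer instead of swallowing them. */
public class ReconnectErrorSketch {

    /** Hypothetical callback the consumer would implement. */
    public interface SessionErrorListener {
        void onSessionEstablishmentError(Throwable error);
    }

    /** Stand-in for the code that creates a new ZooKeeper connection. */
    public interface ConnectionFactory {
        void connect() throws IOException;
    }

    /**
     * Attempts a reconnect. On failure the exception is handed to the listener,
     * so the consumer can decide whether to retry, alert, or shut down.
     */
    public static boolean reconnect(ConnectionFactory factory, SessionErrorListener listener) {
        try {
            factory.connect();
            return true;
        } catch (IOException e) {
            // UnknownHostException is a subclass of IOException, so it lands here.
            listener.onSessionEstablishmentError(e);
            return false;
        }
    }
}
```

The point is only that the consumer gets the exception; what it does with it (retry with backoff, exit the process) would remain its own decision.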