uber/tchannel

Peer leaks and CPU spikes

ShanniLi opened this issue · 2 comments

This is to track the issue of peer leaks and CPU spikes in some Hyperbahn nodes.

While on call, I observed spikes in CPU usage and in connection FDs on some Hyperbahn nodes. The memory dumps suggest that we are leaking TChannelPeer objects (130k of them).

The suspicion is that this is related to the aggressive outgoing peer selection logic. However, the number of leaked connections doesn't account for the number of leaked peers. One possible reason is that when a non-ephemeral connection is closed, the peer never gets deleted ...
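To make the suspected mechanism concrete, here is a minimal sketch in Python. It is not tchannel's actual code: `Peer`, `PeerTable`, `close_leaky`, `close_and_prune`, and `simulate` are hypothetical names made up for illustration. It only shows how a peer table can keep growing while the number of open connections stays at zero, which would be consistent with the peer count exceeding the connection count.

```python
class Peer:
    """Hypothetical stand-in for a peer entry keyed by host:port."""

    def __init__(self, host_port):
        self.host_port = host_port
        self.connections = []


class PeerTable:
    """Hypothetical peer registry; not the real tchannel data structure."""

    def __init__(self):
        self.peers = {}

    def connect(self, host_port):
        # Create the peer on first use, then attach a new connection to it.
        peer = self.peers.get(host_port)
        if peer is None:
            peer = Peer(host_port)
            self.peers[host_port] = peer
        conn = object()  # placeholder for a socket
        peer.connections.append(conn)
        return peer, conn

    def close_leaky(self, peer, conn):
        # Suspected behaviour: the connection goes away, the peer stays.
        peer.connections.remove(conn)

    def close_and_prune(self, peer, conn):
        # Possible fix (an assumption): drop the peer once it has no connections.
        peer.connections.remove(conn)
        if not peer.connections:
            self.peers.pop(peer.host_port, None)


def simulate(n, prune):
    """Open and close one connection to n distinct peers.

    Returns (number of peers left in the table, number of open connections).
    """
    table = PeerTable()
    for i in range(n):
        peer, conn = table.connect(f"10.0.0.1:{i}")
        if prune:
            table.close_and_prune(peer, conn)
        else:
            table.close_leaky(peer, conn)
    open_conns = sum(len(p.connections) for p in table.peers.values())
    return len(table.peers), open_conns
```

With `prune=False` every peer ever contacted stays in the table even though no connection remains open; with `prune=True` the table empties out. Whether pruning on close is safe for non-ephemeral peers in Hyperbahn is an open question here, not something this sketch settles.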

I would consider this issue open until we have more clarity on the cause.

cc: @jcorbin @Raynos @kriskowal

The CPU spikes are caused by the multi-tenant issue in Hyperbahn. Closing this issue since there is another task tracking it.

There is no "multi-tenant" issue -.-