pipelinedb/pipeline_kafka

Memory leak?

Closed · 1 comment

On the 0.8.2.2 branch we're seeing what appears to be a memory leak (screenshot attached), with allocations originating in a librdkafka (branch 0.9.1) thread. It only seems to happen at peak throughput, and as far as I can tell we're properly deallocating messages when we're done with them, so this one has been a bit puzzling.

[Screenshot: massif output]

For reference, here's the main consumer loop.

I spoke with @edenhill on Gitter. The short version is that librdkafka will prefetch up to queued.max.messages.kbytes * num_partitions kilobytes of messages into memory, so this is bounded prefetching rather than a leak. We were seeing this behavior on a 90-partition topic, which with the default value for queued.max.messages.kbytes means queueing up to ~80GB of messages in memory. Lowering that setting should resolve the issue.
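
To make the arithmetic concrete, here's a rough sketch of the bound described above and of how you might pick a lower setting from a memory budget. The function names are made up for illustration and are not part of librdkafka or pipeline_kafka. The values passed in are assumptions too; check the documented default of queued.max.messages.kbytes for the librdkafka version you're running.

```python
def worst_case_prefetch_kbytes(num_partitions: int,
                               queued_max_messages_kbytes: int) -> int:
    """Upper bound on prefetched message data, in kilobytes.

    Mirrors the rule of thumb from the thread:
    queued.max.messages.kbytes * num_partitions.
    """
    if num_partitions < 0 or queued_max_messages_kbytes < 0:
        raise ValueError("inputs must be non-negative")
    return num_partitions * queued_max_messages_kbytes


def kbytes_per_partition_for_budget(num_partitions: int,
                                    budget_kbytes: int) -> int:
    """Largest per-partition setting that keeps the bound within budget.

    Hypothetical helper: floors the result at 1 so the suggested
    value is never zero.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return max(1, budget_kbytes // num_partitions)
```

For example, assuming a per-partition limit of 1,000,000 KB, worst_case_prefetch_kbytes(90, 1_000_000) gives 90,000,000 KB, i.e. tens of gigabytes for a 90-partition topic. Going the other way, kbytes_per_partition_for_budget(90, 4_000_000) suggests a value of 44444 for queued.max.messages.kbytes if you want to stay under roughly 4GB of prefetched messages.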