mailgun/kafka-pixy

Inexplicable offset manager timeouts

Closed this issue · 0 comments

From time to time the logs show that the offset manager timeout has elapsed. A timeout means that offsets were not committed in time, so more messages may be delivered more than once if Kafka-Pixy crashes. The root cause of this issue is not clear; it should be investigated, understood, and fixed before it gets out of hand.

request timeout 1.500776182s
github.com/mailgun/kafka-pixy/offsetmgr.(*offsetMgr).run
	/go/src/github.com/mailgun/kafka-pixy/offsetmgr/offsetmgr.go:320
github.com/mailgun/kafka-pixy/offsetmgr.(*offsetMgr).(github.com/mailgun/kafka-pixy/offsetmgr.run)-fm
	/go/src/github.com/mailgun/kafka-pixy/offsetmgr/offsetmgr.go:138
github.com/mailgun/kafka-pixy/actor.Spawn.func1
	/go/src/github.com/mailgun/kafka-pixy/actor/actor.go:98
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:2337

The issue becomes more pronounced when several heavily consuming Kafka-Pixy instances are restarted at once (e.g. during a deployment).