linkedin/brooklin

Start Offsets not taking effect in KafkaConnectorTask

shenodaguirguis opened this issue · 0 comments

Subject of the issue

KafkaConnectorTask extends AbstractKafkaBasedConnectorTask, which lets users configure start offsets so that the Kafka consumer begins consuming from them. However, these start offsets take effect only after an initial poll fails with NoOffsetForPartitionException. Specifically, AbstractKafkaBasedConnectorTask.pollRecords() calls handleNoOffsetForPartitionException(), which in turn seeks to the configured start offsets, if present. The assumption is that a new consumer starts with no checkpoints, so the very first poll will throw NoOffsetForPartitionException.

However, the Kafka consumer has a config, auto.offset.reset, which, when set to 'earliest' or 'latest', automatically seeks to the beginning or end of the topic without throwing an exception. The Kafka consumer's default for this config is 'latest'. Therefore, if auto.offset.reset is omitted (so the default applies) or is set to any value other than 'none', no exception is thrown and the start offsets are never used.
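To make the control flow concrete, here is a minimal, self-contained sketch that models the behaviour described above without depending on Kafka or Brooklin. Everything in it is hypothetical and for illustration only: `StartOffsetSketch`, `NoOffsetException` (a stand-in for Kafka's NoOffsetForPartitionException), `pollWithoutCheckpoint`, `firstPosition`, and the log bounds 0 and 100 are not taken from the Brooklin code base. Rethrowing when no start offset is configured is also an assumption.

```java
import java.util.Optional;

public class StartOffsetSketch {
    // Hypothetical stand-in for Kafka's NoOffsetForPartitionException.
    static class NoOffsetException extends RuntimeException {}

    // Assumed log bounds for the illustration.
    static final long LOG_START = 0L;
    static final long LOG_END = 100L;

    // Models the first poll of a consumer that has no committed offset.
    static long pollWithoutCheckpoint(String autoOffsetReset) {
        switch (autoOffsetReset) {
            case "earliest":
                return LOG_START;              // silently seeks to the beginning
            case "latest":
                return LOG_END;                // silently seeks to the end
            case "none":
                throw new NoOffsetException(); // only this policy surfaces an exception
            default:
                throw new IllegalArgumentException("unknown policy: " + autoOffsetReset);
        }
    }

    // Models the reported flow: start offsets are consulted only in the exception handler.
    static long firstPosition(String autoOffsetReset, Optional<Long> startOffset) {
        try {
            return pollWithoutCheckpoint(autoOffsetReset);
        } catch (NoOffsetException e) {
            // Assumption: with no start offset configured, the exception propagates.
            return startOffset.orElseThrow(() -> e);
        }
    }

    public static void main(String[] args) {
        Optional<Long> start = Optional.of(42L);
        for (String policy : new String[] {"none", "earliest", "latest"}) {
            System.out.println(policy + " -> " + firstPosition(policy, start));
        }
    }
}
```

In this model the configured start offset (42) is returned only for the 'none' policy; with 'earliest' or 'latest' the handler never runs, so the start offset is silently bypassed.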

Expected behaviour

Brooklin's KafkaConnector datastream should start consuming from the configured start offsets whenever they are set.

Actual behaviour

Start offsets are ignored.