gerritjvv/kafka-fast

Kafka resets offsets (or possibly loses data) if a broker goes down after writing

gerritjvv opened this issue · 2 comments

I've run several tests and found that after a successful run of the test, kafka for some reason resets its own offsets on a certain partition. I've confirmed this over several runs and it's always reproducible.

The pattern is:
Test 1 OK: redis offset N, kafka offset N; after the test the kafka offset is reset to N - 50
Test 2 fails: redis and kafka offsets are the same
Test 3 OK: redis offset N, kafka offset N; after the test the kafka offset is reset to N - 50

The reset happens after the test completes. The strange thing is that the kafka and redis offsets are the same until near the end of the test, and either at the end of the test or after it the kafka offset is decremented by 50.

The only resolution I have for this is that if I encounter a situation where the redis offset is bigger than the max kafka offset, I fall back to the kafka offset. This works for this particular situation, but I do not know if it could create duplicate data in the future. I guess duplicates are better than missing data.
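That workaround can be sketched roughly as follows (a hypothetical Python illustration, not kafka-fast's actual Clojure code; the function name and parameters are made up):

```python
def reconcile_offset(redis_offset, kafka_max_offset):
    """Pick the offset to resume consuming from.

    If the offset saved in redis is ahead of kafka's max offset
    (as happens after kafka resets its offsets), fall back to the
    kafka offset so newly written records are not silently skipped.
    This may cause some records to be re-read (duplicates), which
    is the trade-off described above.
    """
    if redis_offset > kafka_max_offset:
        return kafka_max_offset  # may re-read (duplicate) data
    return redis_offset

# fourth-run scenario from the analysis below: redis 200, kafka max 150
print(reconcile_offset(200, 150))  # -> 150
# normal case: redis behind or equal to kafka's max offset
print(reconcile_offset(150, 200))  # -> 150
```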

My complete analysis is:

Redis offsets (partition 0 / partition 1):
start: /0 150, /1 100
second: /0 200, /1 150
third: /0 250, /1 200
fourth: /0 300, /1 200

Kafka offsets:

First run

/0 offset 150, :all-offsets (150 100 0)
/1 offset 100, :all-offsets (100 0)

Test Results: OK

Second run
/0 :offset 200, :all-offsets (200 100 0)
/1 :offset 150, :all-offsets (150 0)

Test Results: Failure

third run
/0 :offset 250, :all-offsets (250 200 100 0)
/1 :offset 200, :all-offsets (200 150 0)

Test Results: OK

Analysis:
After a successful run
offset request to kafka brokers:
srv3 :offset 150, :all-offsets (150 0), :error-code 0, :locked false, :partition 1
srv2 :offset 150, :all-offsets (150 0), :error-code 0, :locked false, :partition 1
srv1 :offset 150, :all-offsets (150 0), :error-code 0, :locked false, :partition 1
but in redis we have saved offset 200 for this partition

What we know:
Looking at how the offsets increase between runs for partition 1, we can see that they logically increase by 50,
just as for partition 0. So we can deduce that the final offset saved in redis is in fact (logically) correct.
There is something strange about the kafka :all-offsets between the first and second run: it goes
from (100 0) to (150 0) instead of the expected (150 100 0), as with partition 0.

fourth run

/0 :offset 300, :all-offsets (300 200 100 0)
/1 :offset 200, :all-offsets (200 150 0)

Test Results: Failed

Analysis:

Offsets in kafka are the same as in redis.
Only offsets 250 - 299 inclusive from partition 0 were read; partition 1 was never read.
This is because for partition 1 redis had 200 as the read offset while kafka had its max offset at 150,
which means all data newly written to kafka fell in the range 150 - 200, so the last 50 offsets were ignored.
There is no explanation as to why kafka first has max offset 200 and then, after the third run completes,
resets the max offset to 150.
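The data-loss window in this fourth run can be illustrated with a small sketch (hypothetical Python, using the partition 1 offsets from the analysis above):

```python
# Fourth-run scenario for partition 1.
redis_offset = 200        # consumer position saved in redis
kafka_max_offset = 150    # kafka's max offset after the reset

# the next 50 records are appended at offsets 150..199
new_records = list(range(kafka_max_offset, kafka_max_offset + 50))

# the consumer only fetches offsets >= its saved redis position,
# so everything below 200 is silently skipped
consumed = [o for o in new_records if o >= redis_offset]
skipped = [o for o in new_records if o < redis_offset]

print(len(consumed))  # -> 0   (partition 1 is never read)
print(len(skipped))   # -> 50  (the last 50 offsets are ignored)
```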

The test performed is:

start kafka + zk
start consumer
send 100 records
wait 120 seconds
stop one kafka broker
stop consumer
count records
stop kafka + zk cluster
start kafka + zk cluster
start consumer
send 100 records
wait 120 seconds
stop consumer
count records

fixed in 3.5.0