Kafka schedule topic compaction
inanme opened this issue · 4 comments
Can you please verify that compaction takes place on the schedule topic, assuming that the schedule topic has already been created as compacted on the broker side?
In our case, it seems that compaction is not happening for the messages which are deleted by KMS.
When the messages are consumed with kafka-console-consumer, the pair (message and tombstone) is consumed every time. In the output below, the key is 0ec45066-495f-4ec6-8a3b-d8a52fdc1a2a:
0ec45066-495f-4ec6-8a3b-d8a52fdc1a2a 42018-04-12T14:44:54.20401Z*scheduler-healthcheckH0ec45066-495f-4ec6-8a3b-d8a52fdc1a2a�{"id": "7889273e-455c-4f4d-addb-554d0463f130", "timestamp": "2018-04-12T14:44:54.204"}
0ec45066-495f-4ec6-8a3b-d8a52fdc1a2a null
Hi Mert (@inanme),
Log compaction should be done by the Kafka broker. There are a couple of broker settings that are important when you use compacted topics. Can you check the broker properties listed below and let me know what the values are?
- log.cleaner.enable
- log.cleaner.delete.retention.ms
- log.cleaner.min.cleanable.ratio
- log.cleaner.min.compaction.lag.ms
Also, can you run the kafka-topics command and provide the configuration of your topic?
Some of the broker's properties can be overridden per topic.
We currently have the following:
log.cleaner.enable=true
set on the broker.
Topic:ACCOUNT_JANITOR-SCHEDULE PartitionCount:8 ReplicationFactor:2 Configs:min.cleanable.dirty.ratio=0.01,min.compaction.lag.ms=1,delete.retention.ms=0,segment.ms=100,cleanup.policy=compact,delete
Set on the topic
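To make the topic-level overrides above easier to read, here is a minimal sketch that splits the Configs string from the kafka-topics output into key/value pairs. The function name `parse_topic_configs` is hypothetical and not part of any Kafka tooling. The only subtlety it handles is that the value of cleanup.policy itself contains a comma.

```python
def parse_topic_configs(configs: str) -> dict:
    """Split a 'k1=v1,k2=v2' string into a dict.

    A fragment without '=' is treated as a continuation of the previous
    value, so 'cleanup.policy=compact,delete' stays in one piece.
    """
    parsed = {}
    last_key = None
    for item in configs.split(","):
        if "=" in item:
            last_key, value = item.split("=", 1)
            parsed[last_key] = value
        elif last_key is not None:
            parsed[last_key] += "," + item
    return parsed


# The Configs value copied from the topic description above.
configs = (
    "min.cleanable.dirty.ratio=0.01,min.compaction.lag.ms=1,"
    "delete.retention.ms=0,segment.ms=100,cleanup.policy=compact,delete"
)
for key, value in parse_topic_configs(configs).items():
    print(f"{key} = {value}")
```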
@inanme @mishamo I've reproduced the behaviour that you are talking about.
Steps:
- send 10 messages with the same key
- wait
- consume from beginning
Result: 2 messages with the same key consumed
However, when I added some messages with random keys, I got one message at the end:
Steps:
- send 5 messages with key = A
- send hundreds of random messages (kafka-producer-perf-test)
- send 5 messages with key = A
- send hundreds of random messages (kafka-producer-perf-test)
- wait
- consume from beginning
Result: 1 message with key = A consumed.
I believe this is the correct behaviour. A compacted topic has a head and a tail, and only the tail is compacted. By adding more messages, I moved the messages with key=A from the head to the tail.
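The head/tail behaviour can be illustrated with a toy model. This is only a sketch, not Kafka's actual cleaner: it assumes the head is the last (active) segment, which is never compacted, and that compaction keeps only the latest record per key across the closed segments. How records are split into segments here is made up for illustration, and tombstone removal after delete.retention.ms is not modelled. The names `compact` and `consume` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (key, value); a value of None stands for a tombstone.
Record = Tuple[str, Optional[str]]


def compact(segments: List[List[Record]]) -> List[List[Record]]:
    """Compact every segment except the last one (the assumed active segment)."""
    if not segments:
        return []
    *closed, active = segments
    # Remember where the latest record for each key sits in the closed segments.
    latest: Dict[str, Tuple[int, int]] = {}
    for s, segment in enumerate(closed):
        for i, (key, _) in enumerate(segment):
            latest[key] = (s, i)
    cleaned = [
        [rec for i, rec in enumerate(segment) if latest[rec[0]] == (s, i)]
        for s, segment in enumerate(closed)
    ]
    # The active segment is returned untouched.
    return cleaned + [active]


def consume(segments: List[List[Record]]) -> List[Record]:
    """Read all records from the beginning, in order."""
    return [rec for segment in segments for rec in segment]


# Same key only: nine records in a closed segment, the tenth in the active one.
same_key = [[("A", f"v{i}") for i in range(9)], [("A", "v9")]]
print(consume(compact(same_key)))  # [('A', 'v8'), ('A', 'v9')]

# Message in a closed segment, tombstone still in the active segment.
pair = [[("k", "payload")], [("k", None)]]
print(consume(compact(pair)))  # [('k', 'payload'), ('k', None)]

# Random-key traffic pushes every key=A record into closed segments.
noise1 = [(f"r{i}", "x") for i in range(3)]
noise2 = [(f"s{i}", "x") for i in range(3)]
mixed = [[("A", "a0"), ("A", "a1")] + noise1, [("A", "a2"), ("A", "a3")], noise2]
print([rec for rec in consume(compact(mixed)) if rec[0] == "A"])  # [('A', 'a3')]
```

In this model the first two cases return two records for the same key, matching the message and tombstone pair reported at the top of the issue, while the third case returns a single record for key A once newer traffic has moved the earlier records out of the active segment.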
Closing this, as @wojda-sky's summary above explains the behaviour that was being observed.