This is an example from the Kafka official documentation which demonstrates how to write a streaming application with the Kafka Streams client library.
The source code is found here.
- Start the local Zookeeper and Kafka servers
- Initiate a topic for the input stream
kafka-topics --create \ --zookeeper 127.0.0.1:2181 \ --replication-factor 1 \ --partitions 1 \ --topic kafka-stream-wordcount-input
- Initiate a topic for the output stream
kafka-topics --create \ --zookeeper 127.0.0.1:2181 \ --replication-factor 1 \ --partitions 1 \ --topic kafka-stream-wordcount-output \ --config cleanup.policy=compact
- Run the word count demo Java application
mvn clean package mvn exec:java -Dexec.mainClass=com.flyer.kafka.WordCountDemo
- Start a producer to write data to the input stream topic
kafka-console-producer --broker-list 127.0.0.1:9092 --topic kafka-stream-wordcount-input
- Start a consumer to observe results from the output stream topic
kafka-console-consumer --bootstrap-server 127.0.0.1:9092 \ --topic kafka-stream-wordcount-output \ --from-beginning \ --formatter kafka.tools.DefaultMessageFormatter \ --property print.key=true \ --property print.value=true \ --property key.deserializer=org.apache.kafka.common.serialization.StringDeserializer \ --property value.deserializer=org.apache.kafka.common.serialization.LongDeserializer
- When writing data strings via the producer, the consumer should display word counts continuously in a streaming manner
- This demo application is written with Kafka Streams DSL. It's worthy to rewrite it using low level Processor API.
- The default aggregate function
count
involves state store materialization, which has a commit frequency. Based on thecommit.interval.ms
config setting, there will be a delay in aggregation results in the demo. This is better explained in this SO entry and this article.The semantics of caching is that data is flushed to the state store and forwarded to the next downstream processor node whenever the earliest of commit.interval.ms or cache.max.bytes.buffering (cache pressure) hits.