Are you looking to enhance your development experience and accelerate the implementation of Kafka Streams? Look no further – Kstreamplify is tailor-made for you!
Kstreamplify is a Java library that empowers you to swiftly create Kafka Streams-based applications, offering a host of additional advanced features.
With Kstreamplify, you can declare your KafkaStreams class and define your topology with minimal effort.
- Overview
- Dependencies
- Features
- Motivation
- Contribution
Wondering what makes Kstreamplify stand out? Here are some of the key features that make it a must-have for Kafka Streams:
- 🚀 Bootstrapping: Automatic startup, configuration, and initialization of Kafka Streams is handled for you. Focus on business implementation rather than the setup.
- 📝 Avro Serializer and Deserializer: Common serializers and deserializers for Avro.
- ⛑️ Error Handling: Catch and route errors to a dead-letter queue (DLQ) topic.
- ☸️ Kubernetes: Accurate readiness and liveness probes for Kubernetes deployment.
- 🤿 Interactive Queries: Dive into Kafka Streams state stores.
- 🧪 Testing: Automatic Topology Test Driver setup. Start writing your tests with minimal effort.
Kstreamplify offers three dependencies, all compatible with Java 17 and 21.
To include the core Kstreamplify library in your project, add the following dependency:
<dependency>
    <groupId>com.michelin</groupId>
    <artifactId>kstreamplify-core</artifactId>
    <version>${kstreamplify.version}</version>
</dependency>
If you are using Spring Boot, you can integrate Kstreamplify with your Spring Boot application by adding the following dependency:
<dependency>
    <groupId>com.michelin</groupId>
    <artifactId>kstreamplify-spring-boot</artifactId>
    <version>${kstreamplify.version}</version>
</dependency>
The dependency is compatible with Spring Boot 3.
For both Java and Spring Boot dependencies, a testing dependency is available to facilitate testing:
<dependency>
    <groupId>com.michelin</groupId>
    <artifactId>kstreamplify-core-test</artifactId>
    <version>${kstreamplify.version}</version>
    <scope>test</scope>
</dependency>
Kstreamplify offers a wide range of features to simplify the development of Kafka Streams applications.
Kstreamplify simplifies the bootstrapping of Kafka Streams applications by handling the startup, configuration, and initialization of Kafka Streams for you.
To create a Kstreamplify application, define a KafkaStreamsStarter bean within your Spring Boot context and override the KafkaStreamsStarter#topology() method:
@Component
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void topology(StreamsBuilder streamsBuilder) {
        // Your topology
    }

    @Override
    public String dlqTopic() {
        return "DLQ_TOPIC";
    }
}
You can define all your Kafka Streams properties directly from the application.yml file as follows:
kafka:
  properties:
    bootstrap.servers: localhost:9092
    schema.registry.url: http://localhost:8081
    application.id: myKafkaStreams
    client.id: myKafkaStreams
    state.dir: /tmp/my-kafka-streams
    acks: all
    auto.offset.reset: earliest
    avro.remove.java.properties: true
Note that all the Kafka Streams properties have been moved under kafka.properties.
Whenever you need to serialize or deserialize records with Avro schemas, you can use the SerdeUtils class as follows:
SerdeUtils.<MyAvroValue>getValueSerde()
or
SerdeUtils.<MyAvroValue>getKeySerde()
Here is an example of using these methods in your topology:
@Component
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void topology(StreamsBuilder streamsBuilder) {
        streamsBuilder
            .stream("INPUT_TOPIC", Consumed.with(Serdes.String(), SerdeUtils.<KafkaPerson>getValueSerde()))
            .to("OUTPUT_TOPIC", Produced.with(Serdes.String(), SerdeUtils.<KafkaPerson>getValueSerde()));
    }
}
Kstreamplify provides the ability to handle errors that may occur in your topology as well as during the production or deserialization of records and route them to a dead-letter queue (DLQ) topic.
To do so, start by overriding the dlqTopic method and return the name of your DLQ topic:
@Component
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void topology(StreamsBuilder streamsBuilder) {
    }

    @Override
    public String dlqTopic() {
        return "DLQ_TOPIC";
    }
}
Kstreamplify provides utilities to handle errors that occur in your topology and route them to a DLQ topic automatically.
The processing result is encapsulated and marked as either success or failure. Failed records are routed to the DLQ topic, while successful records remain available for further processing.
@Component
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void topology(StreamsBuilder streamsBuilder) {
        KStream<String, KafkaPerson> stream = streamsBuilder
            .stream("INPUT_TOPIC", Consumed.with(Serdes.String(), SerdeUtils.getValueSerde()));

        TopologyErrorHandler
            .catchErrors(stream.mapValues(MyKafkaStreams::toUpperCase))
            .to("OUTPUT_TOPIC", Produced.with(Serdes.String(), SerdeUtils.getValueSerde()));
    }

    @Override
    public String dlqTopic() {
        return "DLQ_TOPIC";
    }

    private static ProcessingResult<KafkaPerson, KafkaPerson> toUpperCase(KafkaPerson value) {
        try {
            value.setLastName(value.getLastName().toUpperCase());
            return ProcessingResult.success(value);
        } catch (Exception e) {
            return ProcessingResult.fail(e, value, "Something bad happened...");
        }
    }
}
The map values processing returns a ProcessingResult<V, V2>, where:
- The first parameter is the type of the new value after a successful transformation.
- The second parameter is the type of the current value for which the transformation failed.
You can use the following to mark the result as successful:
ProcessingResult.success(value);
Or the following in a catch clause to mark the result as failed:
ProcessingResult.fail(e, value, "Something bad happened...");
The failed records then need to be removed from the stream of ProcessingResult<V, V2> by sending them to the DLQ topic.
This is done by invoking the TopologyErrorHandler#catchErrors() method.
A healthy stream is then returned and can be further processed.
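The success/failure split described above can be illustrated without Kafka at all. The following is a minimal sketch using plain Java collections; the `ResultSketch` class, its nested `Result` record, and this `catchErrors` helper are hypothetical stand-ins written for illustration and are not the Kstreamplify classes. It only shows the idea: each record is wrapped in a result, failures are diverted to a side channel (playing the role of the DLQ topic), and only successful values continue downstream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only: none of these types belong to Kstreamplify.
final class ResultSketch {
    /** Holds either a transformed value (success) or the original value plus its error (failure). */
    record Result<V, V2>(V value, V2 failedValue, Exception error, String context) {
        static <V, V2> Result<V, V2> success(V value) {
            return new Result<>(value, null, null, null);
        }

        static <V, V2> Result<V, V2> fail(Exception error, V2 failedValue, String context) {
            return new Result<>(null, failedValue, error, context);
        }

        boolean isValid() {
            return error == null;
        }
    }

    /** Applies the mapper to each item, collects failures into dlq, and returns the successful values. */
    static <I, O> List<O> catchErrors(List<I> input, Function<I, Result<O, I>> mapper, List<Result<O, I>> dlq) {
        List<O> healthy = new ArrayList<>();
        for (I item : input) {
            Result<O, I> result = mapper.apply(item);
            if (result.isValid()) {
                healthy.add(result.value());
            } else {
                dlq.add(result);
            }
        }
        return healthy;
    }

    /** Upper-cases the value; a null input triggers a NullPointerException that is turned into a failure. */
    static Result<String, String> toUpperCase(String value) {
        try {
            return Result.success(value.toUpperCase());
        } catch (Exception e) {
            return Result.fail(e, value, "Something bad happened...");
        }
    }

    public static void main(String[] args) {
        List<Result<String, String>> dlq = new ArrayList<>();
        List<String> healthy = catchErrors(Arrays.asList("doe", null, "smith"), ResultSketch::toUpperCase, dlq);
        System.out.println(healthy + " / failed: " + dlq.size()); // prints "[DOE, SMITH] / failed: 1"
    }
}
```

In the real library the same roles are played by a KStream of results and the DLQ topic rather than in-memory lists.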
Kstreamplify provides production and deserialization handlers that send errors to the DLQ topic.
kafka:
  properties:
    default.production.exception.handler: com.michelin.kstreamplify.error.DlqProductionExceptionHandler
    default.deserialization.exception.handler: com.michelin.kstreamplify.error.DlqDeserializationExceptionHandler
An Avro schema needs to be deployed in a Schema Registry on top of the DLQ topic. It is available here.
Kstreamplify defines a default uncaught exception handler that catches all uncaught exceptions and shuts down the client.
If you want to override this behavior, you can override the KafkaStreamsStarter#uncaughtExceptionHandler() method and return your own uncaught exception handler.
@Override
public StreamsUncaughtExceptionHandler uncaughtExceptionHandler() {
    return throwable -> {
        return StreamsUncaughtExceptionHandler.StreamThreadExceptionResponse.SHUTDOWN_APPLICATION;
    };
}
Kstreamplify provides readiness and liveness probes for Kubernetes deployment based on the Kafka Streams state.
By default, the endpoints are available at /ready and /liveness.
The path can be customized by setting the following properties:
kubernetes:
  readiness:
    path: custom-readiness
  liveness:
    path: custom-liveness
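To make the idea of probes "based on the Kafka Streams state" more concrete, here is a small sketch of how a state could be translated into an HTTP status. The enum mirrors the state names of `org.apache.kafka.streams.KafkaStreams.State`, but the mapping rules and the `ProbeSketch` class are assumptions made for illustration; the source does not state which statuses Kstreamplify actually returns for each state.

```java
// Hypothetical illustration only: the mapping below is an assumption, not Kstreamplify's documented behavior.
final class ProbeSketch {
    /** Mirrors the state names of org.apache.kafka.streams.KafkaStreams.State. */
    enum State { CREATED, REBALANCING, RUNNING, PENDING_SHUTDOWN, NOT_RUNNING, PENDING_ERROR, ERROR }

    /** Assumed rule: the instance is ready to serve only once it is RUNNING. */
    static int readinessStatus(State state) {
        return state == State.RUNNING ? 200 : 503;
    }

    /** Assumed rule: the instance is alive unless it has stopped or failed. */
    static int livenessStatus(State state) {
        return switch (state) {
            case NOT_RUNNING, ERROR -> 503;
            default -> 200;
        };
    }

    public static void main(String[] args) {
        for (State state : State.values()) {
            System.out.println(state + " -> ready=" + readinessStatus(state) + ", live=" + livenessStatus(state));
        }
    }
}
```

Under these assumed rules, a rebalancing instance would be reported as alive but not ready, so Kubernetes would stop routing traffic to it without restarting the pod.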
Kstreamplify offers the flexibility to execute custom code through hooks.
The On Start hook allows you to execute code before starting the Kafka Streams instance.
@Component
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void onStart(KafkaStreams kafkaStreams) {
        // Do something before starting the Kafka Streams instance
    }
}
Kstreamplify aims to ease the use of interactive queries in Kafka Streams applications.
The "application.server" property value is determined from the following sources, in order of priority:
- The value of an environment variable whose name is defined by the application.server.var.name property.
  kafka:
    properties:
      application.server.var.name: MY_APPLICATION_SERVER
- The value of a default environment variable named APPLICATION_SERVER.
- localhost.
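The priority order above can be sketched as a small lookup function. This is a hedged illustration, not the library's code: the `ApplicationServerSketch` class and its `resolve` method are hypothetical, and the sketch assumes that when the custom variable name is configured but that variable is not set in the environment, resolution falls through to the next source.

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical illustration only: not the Kstreamplify implementation.
final class ApplicationServerSketch {
    static final String VAR_NAME_PROPERTY = "application.server.var.name";
    static final String DEFAULT_VAR_NAME = "APPLICATION_SERVER";
    static final String FALLBACK = "localhost";

    /** Resolves the application server from the custom variable, then the default variable, then the fallback. */
    static String resolve(Properties kafkaProperties, Map<String, String> env) {
        // 1. Environment variable whose name is given by application.server.var.name
        String customVarName = kafkaProperties.getProperty(VAR_NAME_PROPERTY);
        if (customVarName != null && env.get(customVarName) != null) {
            return env.get(customVarName);
        }
        // 2. Default APPLICATION_SERVER environment variable, else 3. localhost
        String defaultValue = env.get(DEFAULT_VAR_NAME);
        return defaultValue != null ? defaultValue : FALLBACK;
    }

    public static void main(String[] args) {
        Properties properties = new Properties();
        properties.setProperty(VAR_NAME_PROPERTY, "MY_APPLICATION_SERVER");
        // The result depends on the environment the program runs in.
        System.out.println(resolve(properties, System.getenv()));
    }
}
```

Passing the environment as a map keeps the function easy to test without touching real environment variables.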
Kstreamplify provides a REST endpoint to retrieve the Kafka Streams topology as JSON.
By default, the endpoint is available at /topology.
The path can be customized by setting the following properties:
topology:
  path: custom-topology
Kstreamplify facilitates deduplication of a stream through the DeduplicationUtils class, based on various criteria and within a specified time frame.
All deduplication methods return a KStream<String, ProcessingResult<V,V2>> so you can redirect the result to TopologyErrorHandler#catchErrors().
Note: Only streams with String keys and Avro values are supported.
To deduplicate by key:
@Component
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void topology(StreamsBuilder streamsBuilder) {
        KStream<String, KafkaPerson> myStream = streamsBuilder
            .stream("INPUT_TOPIC");

        DeduplicationUtils
            .deduplicateKeys(streamsBuilder, myStream, Duration.ofDays(60))
            .to("OUTPUT_TOPIC");
    }
}
To deduplicate by key and value:
@Component
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void topology(StreamsBuilder streamsBuilder) {
        KStream<String, KafkaPerson> myStream = streamsBuilder
            .stream("INPUT_TOPIC");

        DeduplicationUtils
            .deduplicateKeyValues(streamsBuilder, myStream, Duration.ofDays(60))
            .to("OUTPUT_TOPIC");
    }
}
To deduplicate by predicate:
@Component
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void topology(StreamsBuilder streamsBuilder) {
        KStream<String, KafkaPerson> myStream = streamsBuilder
            .stream("INPUT_TOPIC");

        DeduplicationUtils
            .deduplicateWithPredicate(streamsBuilder, myStream, Duration.ofDays(60),
                value -> value.getFirstName() + "#" + value.getLastName())
            .to("OUTPUT_TOPIC");
    }
}
The given predicate will be used as a key in the window store. The stream will be deduplicated based on the predicate.
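As a rough mental model of time-windowed deduplication, the sketch below keeps the timestamp at which each derived key was last accepted and drops any record whose key reappears within the window. It is an in-memory illustration under stated assumptions: `DeduplicationSketch` and `Person` are hypothetical names, a `HashMap` stands in for the window store, and the choice to measure the window from the last accepted record is an assumption rather than documented Kstreamplify behavior.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only: a HashMap stands in for the Kafka Streams window store.
final class DeduplicationSketch<V> {
    record Person(String firstName, String lastName) { }

    private final Duration window;
    private final Function<V, String> keyExtractor;
    private final Map<String, Instant> lastAccepted = new HashMap<>();

    DeduplicationSketch(Duration window, Function<V, String> keyExtractor) {
        this.window = window;
        this.keyExtractor = keyExtractor;
    }

    /** Returns true if the value should be forwarded, false if its key was already accepted within the window. */
    boolean accept(V value, Instant timestamp) {
        String key = keyExtractor.apply(value);
        Instant previous = lastAccepted.get(key);
        if (previous != null && Duration.between(previous, timestamp).compareTo(window) < 0) {
            return false; // duplicate inside the window
        }
        lastAccepted.put(key, timestamp);
        return true;
    }

    public static void main(String[] args) {
        DeduplicationSketch<Person> dedup = new DeduplicationSketch<>(
            Duration.ofDays(60), p -> p.firstName() + "#" + p.lastName());
        Instant start = Instant.parse("2024-01-01T00:00:00Z");
        Person john = new Person("John", "Doe");

        System.out.println(dedup.accept(john, start));                            // prints "true"
        System.out.println(dedup.accept(john, start.plus(Duration.ofDays(1))));   // prints "false"
        System.out.println(dedup.accept(john, start.plus(Duration.ofDays(61))));  // prints "true"
    }
}
```

The key extractor plays the same role as the function passed to deduplicateWithPredicate: two records are considered duplicates when it yields the same string for both.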
The Kstreamplify Spring Boot module simplifies the integration of OpenTelemetry and its Java agent in Kafka Streams applications by binding all Kafka Streams metrics to the Spring Boot registry.
You can run your application with the OpenTelemetry Java agent by including the following JVM options:
-javaagent:/opentelemetry-javaagent.jar -Dotel.traces.exporter=otlp -Dotel.logs.exporter=otlp -Dotel.metrics.exporter=otlp
It also facilitates the addition of custom tags to the metrics, allowing you to use them to organize your metrics in your Grafana dashboard.
-Dotel.resource.attributes=environment=production,service.namespace=myNamespace,service.name=myKafkaStreams,category=orders
All the tags specified in the otel.resource.attributes property will be included in the metrics and can be observed in the logs during the application startup.
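For reference, the otel.resource.attributes value is a comma-separated list of key=value pairs. The sketch below shows how such a string breaks down into individual tags; the `ResourceAttributesSketch` class is hypothetical and is neither the OpenTelemetry agent's nor Kstreamplify's parsing code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: shows how the key=value pairs map to tags.
final class ResourceAttributesSketch {
    /** Splits "k1=v1,k2=v2" into an insertion-ordered map, skipping entries without a key or '='. */
    static Map<String, String> parse(String attributes) {
        Map<String, String> tags = new LinkedHashMap<>();
        for (String pair : attributes.split(",")) {
            int separator = pair.indexOf('=');
            if (separator > 0) {
                tags.put(pair.substring(0, separator).trim(), pair.substring(separator + 1).trim());
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        String attributes = "environment=production,service.namespace=myNamespace,"
            + "service.name=myKafkaStreams,category=orders";
        // prints "{environment=production, service.namespace=myNamespace, service.name=myKafkaStreams, category=orders}"
        System.out.println(parse(attributes));
    }
}
```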
Kstreamplify eases the use of the Topology Test Driver for testing Kafka Streams applications.
You can create a test class that extends KafkaStreamsStarterTest, override the KafkaStreamsStarterTest#getKafkaStreamsStarter() method to provide your KafkaStreamsStarter implementation, and start writing your tests.
public class MyKafkaStreamsTest extends KafkaStreamsStarterTest {
    private TestInputTopic<String, KafkaPerson> inputTopic;
    private TestOutputTopic<String, KafkaPerson> outputTopic;

    @Override
    protected KafkaStreamsStarter getKafkaStreamsStarter() {
        return new MyKafkaStreams();
    }

    @BeforeEach
    void setUp() {
        inputTopic = testDriver.createInputTopic("INPUT_TOPIC", new StringSerializer(),
            SerdeUtils.<KafkaPerson>getValueSerde().serializer());

        outputTopic = testDriver.createOutputTopic("OUTPUT_TOPIC", new StringDeserializer(),
            SerdeUtils.<KafkaPerson>getValueSerde().deserializer());
    }

    @Test
    void shouldUpperCase() {
        // person is a KafkaPerson test fixture; its construction is omitted here
        inputTopic.pipeInput("1", person);
        List<KeyValue<String, KafkaPerson>> results = outputTopic.readKeyValuesToList();
        assertThat(results.get(0).value.getFirstName()).isEqualTo("FIRST NAME");
        assertThat(results.get(0).value.getLastName()).isEqualTo("LAST NAME");
    }

    @Test
    void shouldFailAndRouteToDlqTopic() {
        // person is a KafkaPerson test fixture expected to make the topology fail
        inputTopic.pipeInput("1", person);
        List<KeyValue<String, KafkaError>> errors = dlqTopic.readKeyValuesToList();
        assertThat(errors.get(0).key).isEqualTo("1");
        assertThat(errors.get(0).value.getContextMessage()).isEqualTo("Something bad happened...");
        assertThat(errors.get(0).value.getOffset()).isZero();
    }
}
Developing applications with Kafka Streams can be challenging and often raises many questions for developers. It involves considerations such as efficient bootstrapping of Kafka Streams applications, handling unexpected business issues, and integrating Kubernetes probes, among others.
To assist developers in overcoming these challenges, we have built this library. Our aim is to provide a comprehensive solution that simplifies the development process and addresses common pain points encountered while working with Kafka Streams.
We welcome contributions from the community! Before you get started, please take a look at our contribution guide to learn about our guidelines and best practices. We appreciate your help in making Kstreamplify a better library for everyone.