scalatest-embedded-kafka

A library that provides an in-memory Kafka broker to run your ScalaTest specs against. It uses Kafka 1.0.0 and ZooKeeper 3.4.10.

Inspired by https://github.com/chbatey/kafka-unit

Version compatibility matrix

scalatest-embedded-kafka is available on Bintray and Maven Central, compiled for both Scala 2.11 and 2.12.

Scala 2.10 is supported until 0.10.0.
Scala 2.11 is supported for all versions.
Scala 2.12 is supported from 0.11.0.

How to use

In your build.sbt file add the following dependency: "net.manub" %% "scalatest-embedded-kafka" % "1.0.0" % "test"
Have your Spec extend the EmbeddedKafka trait.
Enclose the code that needs a running instance of Kafka within the withRunningKafka closure.

import net.manub.embeddedkafka.EmbeddedKafka
import org.scalatest.WordSpec

class MySpec extends WordSpec with EmbeddedKafka {

  "runs with embedded kafka" should {

    withRunningKafka {
      // ... code goes here
    }

  }
}

An in-memory ZooKeeper and an in-memory Kafka broker are started on ports 6000 and 6001 respectively, and both are shut down automatically at the end of the test.

Use without the `withRunningKafka` method

An EmbeddedKafka companion object is provided for usage without the EmbeddedKafka trait. It lets you start and stop ZooKeeper and Kafka programmatically.

class MySpec extends WordSpec {
  
  "runs with embedded kafka" should {

    EmbeddedKafka.start()
    
    // ... code goes here
    
    EmbeddedKafka.stop()
  }
}

Please note that, to avoid Kafka instances not shutting down properly, it's recommended to call EmbeddedKafka.stop() in an after block or in similar teardown logic.
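One way to follow this recommendation is to tie the broker lifecycle to the spec with ScalaTest's BeforeAndAfterAll. The following is a minimal sketch, not the only option: the class name TeardownSpec is made up for illustration, and it assumes the default configuration is picked up implicitly, as in the example above.

```scala
import net.manub.embeddedkafka.EmbeddedKafka
import org.scalatest.{BeforeAndAfterAll, WordSpec}

// TeardownSpec is a hypothetical name used only for this sketch.
class TeardownSpec extends WordSpec with BeforeAndAfterAll {

  // Start ZooKeeper and Kafka once, before any test in this spec runs.
  override def beforeAll(): Unit = {
    super.beforeAll()
    EmbeddedKafka.start()
  }

  // Stop them after the last test, whether the tests passed or failed.
  override def afterAll(): Unit = {
    try EmbeddedKafka.stop()
    finally super.afterAll()
  }

  "an embedded kafka started programmatically" should {
    "be available to every test in the spec" in {
      // ... code goes here
    }
  }
}
```

Because stop() is called from afterAll, the broker is shut down even when an assertion in one of the tests fails.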

Configuration

It's possible to change the ports on which ZooKeeper and Kafka are started by providing an implicit EmbeddedKafkaConfig:

import net.manub.embeddedkafka.{EmbeddedKafka, EmbeddedKafkaConfig}
import org.scalatest.WordSpec

class MySpec extends WordSpec with EmbeddedKafka {

  "runs with embedded kafka on a specific port" should {

    implicit val config = EmbeddedKafkaConfig(kafkaPort = 12345)

    withRunningKafka {
      // now a kafka broker is listening on port 12345
    }

  }
}

If you want to run ZooKeeper and Kafka on arbitrary available ports, you can use the withRunningKafkaOnFoundPort method. This is useful to make tests more reliable, especially when running tests in parallel or on machines where other tests or services may be running with port numbers you can't control.

import net.manub.embeddedkafka.{EmbeddedKafka, EmbeddedKafkaConfig}
import org.scalatest.{Matchers, WordSpec}

// Matchers is needed for the shouldBe assertion below
class MySpec extends WordSpec with Matchers with EmbeddedKafka {

  "runs with embedded kafka on arbitrary available ports" should {

    val userDefinedConfig = EmbeddedKafkaConfig(kafkaPort = 0, zooKeeperPort = 0)

    withRunningKafkaOnFoundPort(userDefinedConfig) { implicit actualConfig =>
      // now a kafka broker is listening on actualConfig.kafkaPort
      publishStringMessageToKafka("topic", "message")
      consumeFirstStringMessageFrom("topic") shouldBe "message"
    }

  }
}
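The value 0 used above follows the usual socket convention: binding to port 0 asks the operating system for any port that is currently free. The sketch below only illustrates that convention with the JDK. It is not the library's own port discovery code, and the FreePort object is a made-up name.

```scala
import java.net.ServerSocket

// FreePort is a hypothetical helper used only to illustrate the port 0 convention.
object FreePort {
  // Binding to port 0 lets the operating system pick any free port;
  // getLocalPort then reports which one was chosen.
  def find(): Int = {
    val socket = new ServerSocket(0)
    try socket.getLocalPort
    finally socket.close()
  }
}
```

The port returned by find() is only known after binding, which is why the actual configuration is handed to the test body as a parameter.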

The same implicit EmbeddedKafkaConfig is used to define custom consumer or producer properties:

import net.manub.embeddedkafka.{EmbeddedKafka, EmbeddedKafkaConfig}
import org.scalatest.WordSpec

class MySpec extends WordSpec with EmbeddedKafka {

  "runs with custom producer and consumer properties" should {
    val customBrokerConfig = Map("replica.fetch.max.bytes" -> "2000000",
      "message.max.bytes" -> "2000000")

    val customProducerConfig = Map("max.request.size" -> "2000000")
    val customConsumerConfig = Map("max.partition.fetch.bytes" -> "2000000")

    implicit val customKafkaConfig = EmbeddedKafkaConfig(
      customBrokerProperties = customBrokerConfig,
      customProducerProperties = customProducerConfig,
      customConsumerProperties = customConsumerConfig)

    withRunningKafka {
      // now a kafka broker is running with the custom broker, producer and consumer properties
    }

  }
}

This works for withRunningKafka, withRunningKafkaOnFoundPort, and EmbeddedKafka.start().

Custom properties can also be passed to the broker when starting Kafka. EmbeddedKafkaConfig has a customBrokerProperties field that accepts extra properties as a Map[String, String], and these are added to the broker configuration. Be careful: the library sets some properties itself, and in case of conflict the customBrokerProperties values take precedence. Please look at the source code to see what these properties are.
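The precedence rule can be pictured as a plain Map merge in which the custom values are applied last. This is only a sketch of the rule, not the library's code: the object name and the default values below are made up, and the real defaults are in the source.

```scala
// BrokerPropertiesSketch and its defaults are hypothetical, for illustration only.
object BrokerPropertiesSketch {
  // Stand-in for the properties the library sets itself.
  val libraryDefaults: Map[String, String] = Map(
    "auto.create.topics.enable" -> "true",
    "message.max.bytes" -> "1000000"
  )

  // With ++, entries from the right-hand map win when a key appears in both.
  def effective(custom: Map[String, String]): Map[String, String] =
    libraryDefaults ++ custom
}
```

Calling BrokerPropertiesSketch.effective(Map("message.max.bytes" -> "2000000")) keeps the untouched default and replaces the conflicting one with the custom value.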

Utility methods

The EmbeddedKafka trait also provides some utility methods to interact with the embedded Kafka, so that you can set up preconditions or run verifications in your specs:

def publishToKafka(topic: String, message: String): Unit

def consumeFirstMessageFrom(topic: String): String

def createCustomTopic(topic: String, topicConfig: Map[String,String], partitions: Int, replicationFactor: Int): Unit
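A possible way to combine these methods in a spec is sketched below. The spec name, the topic name and the topic property are made up for illustration. The String helpers are the ones used in the withRunningKafkaOnFoundPort example, and the createCustomTopic arguments follow the signature listed above.

```scala
import net.manub.embeddedkafka.EmbeddedKafka
import org.scalatest.{Matchers, WordSpec}

// UtilityMethodsSpec is a hypothetical name used only for this sketch.
class UtilityMethodsSpec extends WordSpec with Matchers with EmbeddedKafka {

  "the utility methods" should {
    "create a topic, publish to it and read the message back" in {
      withRunningKafka {
        // Precondition: a topic with two partitions and a custom retention.
        createCustomTopic("orders", Map("retention.ms" -> "60000"),
          partitions = 2, replicationFactor = 1)

        // Precondition: one message on the topic.
        publishStringMessageToKafka("orders", "first order")

        // Verification: the same message can be read back.
        consumeFirstStringMessageFrom("orders") shouldBe "first order"
      }
    }
  }
}
```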

Custom producers

It is possible to create producers for custom types in two ways:

Using the syntax aKafkaProducer thatSerializesValuesWith classOf[Serializer[V]]. This will return a KafkaProducer[String, V]
Using the syntax aKafkaProducer[V]. This will return a KafkaProducer[String, V], using an implicit Serializer[V].
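Both syntaxes might be used as in the sketch below, which uses String values so that the result can be checked with the String consumer helper. The spec name and topic names are made up, and the example assumes the value type is inferred from the serializer class in the first form.

```scala
import net.manub.embeddedkafka.EmbeddedKafka
import org.apache.kafka.clients.producer.ProducerRecord
import org.apache.kafka.common.serialization.{Serializer, StringSerializer}
import org.scalatest.{Matchers, WordSpec}

// CustomProducerSpec is a hypothetical name used only for this sketch.
class CustomProducerSpec extends WordSpec with Matchers with EmbeddedKafka {

  "a custom producer" should {
    "be created from a serializer class" in {
      withRunningKafka {
        val producer = aKafkaProducer thatSerializesValuesWith classOf[StringSerializer]
        // get() blocks until the broker has acknowledged the record.
        try producer.send(new ProducerRecord[String, String]("by-class", "message")).get()
        finally producer.close()
        consumeFirstStringMessageFrom("by-class") shouldBe "message"
      }
    }

    "be created from an implicit serializer" in {
      withRunningKafka {
        implicit val serializer: Serializer[String] = new StringSerializer
        val producer = aKafkaProducer[String]
        try producer.send(new ProducerRecord[String, String]("by-implicit", "message")).get()
        finally producer.close()
        consumeFirstStringMessageFrom("by-implicit") shouldBe "message"
      }
    }
  }
}
```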

For more information about how to use the utility methods, you can either look at the Scaladocs or at the tests of this project.

Custom consumers

The Consumer trait makes it easy to create consumers of arbitrary key-value types and manages their lifecycle for you (via a loaner pattern).

For basic String consumption use Consumer.withStringConsumer { your code here }.
For arbitrary key and value types, expose implicit Deserializers for each type and use Consumer.withConsumer { your code here }.
If you just want to create a consumer and manage its lifecycle yourself then try Consumer.newConsumer().
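For readers unfamiliar with the loaner pattern mentioned above, here is a generic sketch of the idea in plain Scala. It is not the library's implementation: Loan and FakeConsumer are made-up names, and FakeConsumer only stands in for a real KafkaConsumer.

```scala
// Loan is a hypothetical helper that shows the general shape of a loaner method.
object Loan {
  // Creates the resource, lends it to `body`, and always closes it afterwards.
  def withResource[R <: AutoCloseable, T](create: => R)(body: R => T): T = {
    val resource = create
    try body(resource)
    finally resource.close()
  }
}

// Hypothetical stand-in for a consumer, so the sketch runs without Kafka.
final class FakeConsumer extends AutoCloseable {
  var closed = false
  def poll(): List[String] = List("message")
  override def close(): Unit = closed = true
}
```

The caller only writes the body, for example Loan.withResource(new FakeConsumer)(_.poll()), and never has to remember to close the consumer.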

Easy message consumption

With ConsumerExtensions you can turn a consumer into a lazy Scala Stream of T and treat it as a collection for easy assertions.

Just import the extensions.
Bring an implicit ConsumerRecord[_, _] => T transform function into scope (some common functions are provided in Codecs).
On any KafkaConsumer instance you can now do:

import net.manub.embeddedkafka.ConsumerExtensions._
import net.manub.embeddedkafka.Codecs.stringKeyValueCrDecoder
...
consumer.consumeLazily[(String, String)]("from-this-topic").take(3).toList should be (Seq(
  "1" -> "one",
  "2" -> "two",
  "3" -> "three"
))
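The idea behind this can be sketched without Kafka: batches are polled only when the Stream needs more elements, and an implicit transform function turns each record into a T. Everything below (LazyConsumption, Record, the poll function) is made up for illustration and is not how ConsumerExtensions is implemented.

```scala
// LazyConsumption is a hypothetical, Kafka-free illustration of lazy consumption.
object LazyConsumption {
  // Stand-in for ConsumerRecord, reduced to a key and a value.
  final case class Record(key: String, value: String)

  // Builds a Stream over the polled batches; further batches are requested
  // only when more elements are demanded, and each record is decoded to T.
  def consumeLazily[T](poll: () => Seq[Record])(implicit decode: Record => T): Stream[T] =
    Stream.continually(poll()).flatten.map(decode)
}
```

With an implicit Record => (String, String) in scope, consumeLazily[(String, String)](poll).take(3).toList can be asserted on like any other collection.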