
siddhi-io-kafka

The siddhi-io-kafka extension is an extension to Siddhi that implements a Kafka source and sink, which can be used to receive events from a Kafka cluster and to publish events to a Kafka cluster.

The Kafka source receives records from a topic with a partition of a Kafka cluster, in formats such as TEXT, XML, and JSON. If the topic has not already been created in the Kafka cluster, the Kafka source creates the default partition for the given topic.

The Kafka sink publishes records to a topic with a partition of a Kafka cluster, in formats such as TEXT, XML, and JSON. If the topic has not already been created in the Kafka cluster, the Kafka sink creates the default partition for the given topic. The publishing topic and partition can be a dynamic value taken from the Siddhi event.
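
As a quick illustration, the following is a minimal sketch of a Siddhi application that consumes JSON events from one Kafka topic and republishes them to another. The broker address, topic names, group id, and stream definitions are placeholders, only a subset of the supported parameters is shown, and the @map annotations assume the corresponding Siddhi mapper extensions are available; see the API docs for the full parameter reference.

    @App:name('KafkaSampleApp')

    @source(type='kafka',
            topic.list='kafka_input_topic',
            partition.no.list='0',
            group.id='sample-group',
            threading.option='single.thread',
            bootstrap.servers='localhost:9092',
            @map(type='json'))
    define stream InputStream (name string, amount double);

    @sink(type='kafka',
          topic='kafka_output_topic',
          partition.no='0',
          bootstrap.servers='localhost:9092',
          @map(type='json'))
    define stream OutputStream (name string, amount double);

    from InputStream
    select *
    insert into OutputStream;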

Find some useful links below:

Latest API Docs

The latest API docs are for version 4.2.0.

How to use

Using the extension in WSO2 Stream Processor

Prerequisites for using the feature

  • Download and install Apache Kafka and Apache ZooKeeper.

  • Start the Apache ZooKeeper server with the following command: bin/zookeeper-server-start.sh config/zookeeper.properties.

  • Start the Kafka server with the following command: bin/kafka-server-start.sh config/server.properties

  • Convert and copy the Kafka client jars from the <KAFKA_HOME>/libs directory to the <SP_HOME>/lib directory as follows.

    • Create a directory (SOURCE_DIRECTORY) in a preferred location in your machine and copy the following JARs to it from the <KAFKA_HOME>/libs directory.

      • kafka_2.11-*.jar
      • kafka-clients-*.jar
      • metrics-core-*.jar
      • scala-library-2.11.8.jar
      • scala-parser-combinators_2.11*.jar (if it exists)
      • zkclient-*.jar
      • zookeeper-*.jar
    • Create another directory (DESTINATION_DIRECTORY) in a preferred location in your machine.

    • To convert all the Kafka jars you copied into the <SOURCE_DIRECTORY>, issue the following command.

      • For Windows: <SP_HOME>/bin/jartobundle.bat <SOURCE_DIRECTORY_PATH> <DESTINATION_DIRECTORY_PATH>
      • For Linux: <SP_HOME>/bin/jartobundle.sh <SOURCE_DIRECTORY_PATH> <DESTINATION_DIRECTORY_PATH>
    • Copy the converted files from the <DESTINATION_DIRECTORY> to the <SP_HOME>/lib directory.

    • Copy the jars that are not converted from the <SOURCE_DIRECTORY> to the <SP_HOME>/samples/sample-clients/lib directory.

  • You can use this extension with the latest WSO2 Stream Processor, which is a part of the WSO2 Analytics offering, with editor, debugger, and simulation support.

  • This extension is shipped by default with WSO2 Stream Processor. If you wish to use an alternative version of this extension, you can replace the component jar found in the <STREAM_PROCESSOR_HOME>/lib directory.

Using the extension as a java library

  • This extension can be added as a maven dependency along with other Siddhi dependencies to your project.
     <dependency>
        <groupId>org.wso2.extension.siddhi.io.kafka</groupId>
        <artifactId>siddhi-io-kafka</artifactId>
        <version>x.x.x</version>
     </dependency>

Jenkins Build Status


Branch: master (build status badge)

Features

  • kafka (Sink)

    A Kafka sink publishes events processed by WSO2 SP to a topic with a partition of a Kafka cluster. The events can be published in the TEXT, XML, JSON, or Binary format.
    If the topic is not already created in the Kafka cluster, the Kafka sink creates the default partition for the given topic. The publishing topic and partition can be a dynamic value taken from the Siddhi event.
    To configure a sink to use the Kafka transport, the type parameter should have kafka as its value (a configuration sketch is given after this list).

  • kafkaMultiDC (Sink)

    A Kafka sink publishes events processed by WSO2 SP to a topic with a partition of a Kafka cluster. The events can be published in the TEXT, XML, JSON, or Binary format.
    If the topic is not already created in the Kafka cluster, the Kafka sink creates the default partition for the given topic. The publishing topic and partition can be a dynamic value taken from the Siddhi event.
    To configure a sink to publish events via the Kafka transport, using two Kafka brokers to publish events to the same topic, the type parameter must have kafkaMultiDC as its value (see the sketch after this list).

  • kafka (Source)

    A Kafka source receives events to be processed by WSO2 SP from a topic with a partition of a Kafka cluster. The events received can be in the TEXT, XML, JSON, or Binary format.
    If the topic is not already created in the Kafka cluster, the Kafka source creates the default partition for the given topic.

  • kafkaMultiDC (Source)

    The Kafka Multi-Datacenter (DC) source receives records from the same topic in brokers deployed in two different Kafka clusters. It filters out all the duplicate messages and ensures that the events are received in the correct order using sequential numbering. It receives events in formats such as TEXT, XML, JSON, and Binary. The Kafka Multi-DC source creates the default partition '0' for a given topic if the topic has not yet been created in the Kafka cluster.
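
As an illustration of the sink configurations above, the following is a minimal sketch of a plain kafka sink and a kafkaMultiDC sink. The stream definitions, topic names, and broker addresses are placeholders, and only a subset of the supported parameters is shown; refer to the API docs for the full parameter reference.

    @sink(type='kafka',
          topic='sweet_production',
          partition.no='0',
          bootstrap.servers='localhost:9092',
          @map(type='xml'))
    define stream SweetProductionStream (name string, amount double);

    @sink(type='kafkaMultiDC',
          topic='sweet_production',
          partition.no='0',
          bootstrap.servers='host1:9092,host2:9092',
          @map(type='xml'))
    define stream ReplicatedProductionStream (name string, amount double);

The source configurations follow the same pattern with @source annotations: the kafka source uses the topic.list, partition.no.list, group.id, and threading.option parameters shown in the earlier example, and the kafkaMultiDC source likewise lists brokers from both clusters in bootstrap.servers (see the API docs for its exact parameter set).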

How to Contribute

Contact us

Support

  • We are committed to ensuring support for this extension in production. Our unique approach ensures that all support leverages our open development methodology and is provided by the very same engineers who build the technology.

  • For more details and to take advantage of this unique opportunity contact us via http://wso2.com/support/.