Example implementations of
UDFs (user-defined functions, like FLOOR
, MASK
) and
UDAFs (user-defined aggregate functions, like SUM
, COUNT
)
for KSQL.
Table of Contents
Compatible KSQL versions:
- The examples in this repository are built against KSQL 5.3.1 (see pom.xml) for Apache Kafka 2.3.1 and Confluent Platform 5.3.1.
Requirements to locally build, test, package the UDF/UDAF examples:
- Java 8+
- Maven 3.0+
To write UDFs/UDAFs (details), you can use the following examples as a starting point. Simply fork this repository and add or modify the examples as you see fit. Of course, you can also contribute examples to this repository by sending a pull request.
Example | Type | Description |
---|---|---|
MULTIPLY(col1, col2) (tests) |
UDF | Multiplies two numbers |
SUM_NULL(col1) (tests) |
UDAF | Sums values in a colum, treating null as 0 (zero) |
To unit-test the UDFs/UDAFs:
$ mvn clean test
To package the UDFs/UDAFs (details):
# Create a standalone jar ("fat jar")
$ mvn clean package
# >>> Creates target/ksql-udf-demo-1.0-jar-with-dependencies.jar
To deploy the packaged UDFs/UDAFs to KSQL servers, refer to the
KSQL documentation on UDF/UDAF.
You can verify that the UDFs/UDAFs are available for use by running SHOW FUNCTIONS
, and show the details of
any specific function with DESCRIBE FUNCTION <name>
.
To use the UDFs/UDAFs in KSQL (details):
CREATE STREAM numbers (b1 BIGINT, b2 BIGINT, d1 DOUBLE, d2 DOUBLE, v VARCHAR)
WITH (VALUE_FORMAT = 'JSON', KAFKA_TOPIC = 'numbers');
SELECT MULTIPLY(b1, b2), MULTIPLY(d1, d2) FROM numbers;
SELECT SUM_NULL(b1), SUM_NULL(d1), SUM_NULL(v) FROM numbers;
- Head over to the KSQL documentation.
- Mitch Seymour's talk The Exciting Frontier of Custom KSQL Functions (slides), Kafka Summit London 2019, which includes UDF usage for machine learning as well as a POC for writing UDFs in non-JVM languages like Ruby, Python
See LICENSE for licensing information.
Bring in as a String, second parameter should be a list of fields we want. Parse the string, then grab fields passed in as the second parameter. But is there anyway that we can specify the schema within the @Udf decorater?
Maybe this just becomes