Robust machine learning on streaming data using Kafka and TensorFlow I/O
This tutorial focuses on streaming data from a Kafka cluster into a tf.data.Dataset, which is then used in conjunction with tf.keras for training and inference.
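A minimal sketch of that flow is shown below. It assumes a broker at localhost:9092 and a topic named susy-train whose messages are CSV-encoded feature rows with the label stored in the message key; the topic name, column count, and model are illustrative assumptions, and the exact arguments of tfio.IODataset.from_kafka may vary with the tensorflow-io version.

```python
import tensorflow as tf
import tensorflow_io as tfio

NUM_COLUMNS = 8  # assumed number of features per CSV-encoded message

def decode_kafka_item(item):
    # Each dataset element carries the raw Kafka message and its key.
    features = tf.io.decode_csv(item.message, [[0.0]] * NUM_COLUMNS)
    label = tf.strings.to_number(item.key)
    return (tf.stack(features), label)

# Stream the topic from the beginning of partition 0 into a tf.data.Dataset.
train_ds = tfio.IODataset.from_kafka(
    "susy-train", partition=0, offset=0, servers="localhost:9092"
)
train_ds = train_ds.map(decode_kafka_item).batch(64)

# A small binary classifier trained directly on the streamed dataset.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(NUM_COLUMNS,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_ds, epochs=3)
```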
Kafka is primarily a distributed event-streaming platform that provides scalable and fault-tolerant data streaming across data pipelines. It is an essential technical component in many major enterprises where mission-critical data delivery is a primary requirement.
TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. It provides useful extra Dataset, streaming, and file system extensions, and is maintained by TensorFlow SIG IO.
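The package is distributed separately from core TensorFlow, so a typical setup looks like the sketch below (version pinning omitted):

```python
# Install the package alongside TensorFlow (shell command shown as a comment):
#   pip install tensorflow-io

import tensorflow as tf
import tensorflow_io as tfio

# Confirm the versions in use; tensorflow-io releases track specific
# TensorFlow releases, so a mismatched pair can cause import errors.
print("tensorflow:", tf.__version__)
print("tensorflow-io:", tfio.__version__)
```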