Data Pipeline Kafka K8s

A data pipeline example using Kafka on K8s.

Prerequisites

Preparation

Install Python 3.8 with pyenv or Anaconda, then run the following command:

$ make init     # set up packages (needed only once)
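
If make init fails, a quick first check is whether the expected interpreter is active (this assumes python on your PATH resolves to the pyenv or Anaconda environment you just installed):

$ python --version      # should report Python 3.8.x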

Infra Setup

1. K8s Cluster

$ make cluster          # create a k8s cluster (needed only once)
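
Once the cluster is created, you can verify it with kubectl (this assumes the make target points your kubeconfig at the new cluster):

$ kubectl cluster-info      # control plane should be reachable
$
$ kubectl get nodes         # all nodes should report STATUS Ready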

You can delete the k8s cluster.

$ make cluster-clean    # delete the k8s cluster

2. Source DB, Target DB, and Message Queue

$ make mongodb-operator     # create a mongodb operator
$
$ make mongodb              # create a mongodb (source DB)
$
$ make postgres             # create a postgres (target DB)
$
$ make redis                # create a redis (message queue)
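
The operators and data stores run as pods, so a quick sanity check is to list them across namespaces (the Makefile may deploy them into dedicated namespaces, so the filter below is only a rough guide):

$ kubectl get pods -A | grep -E 'mongo|postgres|redis'      # pods should be Running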

You can delete MongoDB, Postgres, and Redis.

$ make mongodb-clean        # delete the mongodb
$
$ make postgres-clean       # delete the postgres
$
$ make redis-clean          # delete the redis

3. Kafka Cluster

$ make kafka-operator       # create a kafka operator w/ strimzi
$
$ make kafka-cluster        # create a kafka cluster
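
Strimzi manages the brokers through a Kafka custom resource, so you can check its status and the broker pods once the operator has reconciled it (the kafka namespace below is an assumption; adjust it to whatever the make target uses):

$ kubectl get kafka -A          # the Kafka resource should eventually report READY
$
$ kubectl get pods -n kafka     # broker and entity-operator pods (assumed namespace)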

You can delete the Kafka cluster.

$ make kafka-clean        # delete the kafka cluster

4. Schema Registry & Kafka Connect

$ make schema-registry     # create a schema registry
$
$ make kafka-connect       # create a kafka connect
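
Both components expose HTTP APIs, so once their pods are ready you can port-forward and query them; the service names and ports below are assumptions based on common defaults (Schema Registry on 8081, Kafka Connect on 8083), so adjust them to match the manifests:

$ kubectl port-forward svc/schema-registry 8081:8081 &     # assumed service name
$ curl http://localhost:8081/subjects                      # list registered schemas
$
$ kubectl port-forward svc/kafka-connect 8083:8083 &       # assumed service name
$ curl http://localhost:8083/connectors                    # list deployed connectors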

You can delete Kafka Connect and the Schema Registry.

$ make schema-registry-clean    # delete the schema registry
$
$ make kafka-connect-clean      # delete the kafka connect

For Developers

$ make check          # run all static analysis on the scripts
$ make format         # format the scripts
$ make lint           # lint the scripts
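
A reasonable pre-commit pass is to format first and then lint; this ordering is a suggestion, not something the Makefile enforces:

$ make format && make lint      # rewrite formatting, then verify with the linters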