kafka-twitter-integration

Kafka Twitter Integration

Problem Statement:
A user wants to collect targeted tweets (for example, tweets about bitcoin) from Twitter and mine the data using Elasticsearch.
Data mining here means counting how many times a tweet is reshared, liked, or commented on.
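
As a rough illustration of the counts described above, the sketch below sums reshares, likes, and comments over tweets that mention a keyword. It is not code from this project. It assumes each tweet arrives as a JSON string with a `text` field and a `public_metrics` object holding `retweet_count`, `like_count`, and `reply_count` (the Twitter API v2 field names); the function name `engagement_totals` and the sample tweets are made up for the example.

```python
import json


def engagement_totals(raw_tweets, keyword):
    """Sum reshares, likes, and comments over tweets whose text mentions `keyword`."""
    totals = {"tweets": 0, "reshared": 0, "liked": 0, "commented": 0}
    for raw in raw_tweets:
        tweet = json.loads(raw)
        # Case-insensitive keyword match on the tweet text.
        if keyword.lower() not in tweet.get("text", "").lower():
            continue
        # Assumed layout: Twitter API v2 style `public_metrics` object.
        metrics = tweet.get("public_metrics", {})
        totals["tweets"] += 1
        totals["reshared"] += metrics.get("retweet_count", 0)
        totals["liked"] += metrics.get("like_count", 0)
        totals["commented"] += metrics.get("reply_count", 0)
    return totals


# Hypothetical sample data, standing in for messages read from a Kafka topic.
SAMPLE_TWEETS = [
    '{"id": "1", "text": "Bitcoin hits a new high", '
    '"public_metrics": {"retweet_count": 12, "like_count": 40, "reply_count": 3}}',
    '{"id": "2", "text": "Lunch was great", '
    '"public_metrics": {"retweet_count": 5, "like_count": 9, "reply_count": 1}}',
    '{"id": "3", "text": "Is bitcoin mining worth it?", '
    '"public_metrics": {"retweet_count": 2, "like_count": 7, "reply_count": 6}}',
]

print(engagement_totals(SAMPLE_TWEETS, "bitcoin"))
# {'tweets': 2, 'reshared': 14, 'liked': 47, 'commented': 9}
```

In the actual setup these aggregations would more likely run as Elasticsearch queries over the indexed tweets; the in-memory version only shows what is being counted.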

  1. Kafka: 7.10.2 with a multi-cluster + replication setup (Active <-> Active) using MirrorMaker
  2. Kafka Connect for scalable and reliable streaming of data between Apache Kafka® and other data systems
  3. Kafka Streams for transforming data from one Kafka topic to another topic in real time
  4. Debezium for Change Data Capture (CDC): https://lnkd.in/eyHHdZs
  5. Kafka Schema Registry for establishing contracts between producers and consumers
  6. Kafka Producer that reads tweets from Twitter (https://lnkd.in/ewhDfdN)
  7. Kafka Consumer that stores data in Elasticsearch using https://Bonsai.io
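
To make the last step of the list more concrete, here is a minimal sketch of how a consumer might turn a batch of tweets into a request body for the Elasticsearch bulk API. It is an illustration under assumptions, not this project's consumer: the helper name `to_bulk_body`, the index name `tweets`, and the `id` field on each tweet are all hypothetical. Reading from Kafka and sending the HTTP request are left out.

```python
import json


def to_bulk_body(tweets, index_name):
    """Build a newline-delimited JSON body for the Elasticsearch `_bulk` endpoint.

    Each document becomes two lines: an action line and the document source.
    """
    lines = []
    for tweet in tweets:
        # Using the tweet id as the document id means a redelivered message
        # overwrites the same document instead of creating a duplicate.
        action = {"index": {"_index": index_name, "_id": str(tweet["id"])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(tweet))
    if not lines:
        return ""
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical batch, standing in for records polled from a Kafka topic.
batch = [{"id": 1, "text": "bitcoin"}, {"id": 2, "text": "bitcoin again"}]
print(to_bulk_body(batch, "tweets"), end="")
```

The resulting string would be sent as a POST to the cluster's `_bulk` endpoint with the `Content-Type: application/x-ndjson` header. Keying documents by tweet id is a design choice that suits a consumer with at-least-once delivery, since replays do not inflate the counts.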
