This Python-based project fetches real-time YouTube metrics (likes, views, comments, and favorites) and streams them through Kafka. ksqlDB handles the stream processing, and the processed data is sent to a Telegram bot for real-time notifications.
- System Architecture
- Requirements
- Getting Started
- Configuration
- Running the Code
- How It Works
- Contributing
- Video
- Python 3.10 (minimum)
- Kafka
- Telegram API
- Docker
- Confluent Containers (Zookeeper, Kafka, Schema Registry, Connect, ksqlDB, Control Center)
- Clone the repository: `git clone https://github.com/airscholar/YoutubeAnalytics.git`
- Install Python dependencies: `pip install -r requirements.txt`
- Make sure you have Docker and the Confluent containers set up.
- Open `config/config.local` and set the following:
  - `YOUTUBE_API_KEY`: Your YouTube API key
  - `PLAYLIST_ID`: The ID of the YouTube playlist you want to track
- Set your Kafka server address in the main script. By default, it is set to `localhost:9092`.
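As a rough illustration of the configuration described above, the sketch below reads the two YouTube settings and pairs them with the default Kafka address. It assumes `config/config.local` uses INI syntax with a `[youtube]` section; the section name, the `load_settings` helper, and the shape of the returned dictionary are hypothetical and may not match the actual script.

```python
from configparser import ConfigParser

# Default Kafka address stated in this README.
DEFAULT_BOOTSTRAP_SERVERS = "localhost:9092"


def load_settings(config_text, bootstrap_servers=DEFAULT_BOOTSTRAP_SERVERS):
    """Parse INI-style config text and return the values the pipeline needs."""
    parser = ConfigParser()
    parser.read_string(config_text)
    youtube = parser["youtube"]  # assumed section name, not confirmed by the repo
    return {
        "api_key": youtube["YOUTUBE_API_KEY"],
        "playlist_id": youtube["PLAYLIST_ID"],
        # "bootstrap.servers" is the standard Kafka client setting for the broker address.
        "kafka": {"bootstrap.servers": bootstrap_servers},
    }


# Example content standing in for config/config.local (placeholder values).
example_config = """
[youtube]
YOUTUBE_API_KEY = your-api-key
PLAYLIST_ID = your-playlist-id
"""

settings = load_settings(example_config)
```

To read the real file instead of a string, `parser.read("config/config.local")` can replace the `read_string` call.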
- Start Kafka and the other Confluent services on Docker with `docker compose up -d`.
- Run the Python script: `python YoutubeAnalytics.py`
- Fetches data from YouTube API using the given playlist ID.
- Sends this data to Kafka.
- A separate component (not included in this repository, but shown in the video) reads from this Kafka topic and performs real-time analytics using ksqlDB.
- The analytics results are then sent to Telegram for real-time notifications.
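The first two steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `YoutubeAnalytics.py`: it calls the public YouTube Data API v3 `playlistItems` and `videos` endpoints with the standard library, and it assumes a producer object with `produce` and `flush` methods, such as the one provided by the `confluent-kafka` package. The function names, the output field names, and the `youtube_videos` topic are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "https://www.googleapis.com/youtube/v3"


def fetch_json(endpoint, params):
    """GET one YouTube Data API v3 endpoint and decode the JSON body."""
    with urlopen(f"{API_ROOT}/{endpoint}?{urlencode(params)}") as response:
        return json.load(response)


def playlist_video_ids(api_key, playlist_id):
    """Yield every video ID in the playlist, following pagination tokens."""
    page_token = None
    while True:
        params = {"key": api_key, "playlistId": playlist_id, "part": "contentDetails"}
        if page_token:
            params["pageToken"] = page_token
        page = fetch_json("playlistItems", params)
        for item in page.get("items", []):
            yield item["contentDetails"]["videoId"]
        page_token = page.get("nextPageToken")
        if not page_token:
            break


def summarize_video(video):
    """Flatten one `videos` resource into the metrics named in this README."""
    stats = video.get("statistics", {})
    return {
        "video_id": video["id"],
        "title": video["snippet"]["title"],
        # The API returns counts as strings; some may be absent (e.g. hidden likes).
        "views": int(stats.get("viewCount", 0)),
        "likes": int(stats.get("likeCount", 0)),
        "comments": int(stats.get("commentCount", 0)),
        "favorites": int(stats.get("favoriteCount", 0)),
    }


def publish_video(producer, topic, summary):
    """Send one summary to Kafka as JSON, keyed by video ID."""
    producer.produce(topic, key=summary["video_id"], value=json.dumps(summary))


def run(api_key, playlist_id, producer, topic="youtube_videos"):
    """Fetch stats for each playlist video and publish them to the topic."""
    for video_id in playlist_video_ids(api_key, playlist_id):
        params = {"key": api_key, "id": video_id, "part": "snippet,statistics"}
        for video in fetch_json("videos", params).get("items", []):
            publish_video(producer, topic, summarize_video(video))
    producer.flush()  # block until queued messages are delivered
```

With `confluent-kafka` installed, a producer built as `Producer({"bootstrap.servers": "localhost:9092"})` could be passed to `run`. Keying each message by video ID keeps updates for the same video in one partition, which is convenient when ksqlDB later compares successive values.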
Pull requests are welcome! For major changes, please open an issue first to discuss what you would like to change.