This repo is an introduction to setting up streaming analytics with open source technologies. We'll use Apache Kafka, Apache Druid, Apache Superset, and Apache Airflow to build a system that gives you a deeper understanding of your customers' behaviour.
To run the system you need Docker (with Docker Compose) installed: https://www.docker.com/
```bash
git clone https://github.com/apot-group/real-time-analytic.git
cd real-time-analytic && docker-compose up
```
Service | URL | User/Password |
---|---|---|
Druid Unified Console | http://localhost:8888/ | None |
Druid Legacy Console | http://localhost:8081/ | None |
Superset | http://localhost:8088/ | create credentials with `docker exec -it superset bash superset-init` |
Airflow | http://localhost:3000/ | `admin` / password in `a-airflow/app/standalone_admin_password.txt` |
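Once `docker-compose up` has finished, you can sanity-check that each UI in the table is responding. This is only a convenience sketch; it assumes the default ports above and that the `requests` package is installed on the host:

```python
# Sketch: check that each service from the table above is reachable.
# Assumes the default ports from docker-compose and the `requests` package.
import requests

services = {
    "Druid Unified Console": "http://localhost:8888/",
    "Druid Legacy Console": "http://localhost:8081/",
    "Superset": "http://localhost:8088/",
    "Airflow": "http://localhost:3000/",
}

for name, url in services.items():
    try:
        status = requests.get(url, timeout=5).status_code
        print(f"{name}: HTTP {status}")
    except requests.ConnectionError:
        print(f"{name}: not reachable yet")
```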
- The Airflow DAG at `a-airflow/app/dags/demo.py` sends a message to the Kafka `demo` topic every minute for each coin in the list `['BTC', 'ETH', 'BTT', 'DOT']`. Each message has the structure below; a sketch of such a producer DAG follows the example.
```json
{
  "data_id": 454,
  "name": "BTC",
  "timestamp": "2021-02-05T10:10:01"
}
```
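For reference, here is a minimal sketch of what such a producer DAG could look like. This is not the code shipped in `a-airflow/app/dags/demo.py`; it assumes Airflow 2.x and the `kafka-python` package, with the broker reachable at `kafka:9092` from inside the Airflow container:

```python
# Hypothetical producer DAG: one message per coin to the 'demo' topic, every minute.
import json
import random
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator
from kafka import KafkaProducer  # provided by kafka-python

COINS = ["BTC", "ETH", "BTT", "DOT"]


def send_demo_messages():
    producer = KafkaProducer(
        bootstrap_servers="kafka:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    for coin in COINS:
        producer.send(
            "demo",
            {
                "data_id": random.randint(1, 1000),  # illustrative id
                "name": coin,
                "timestamp": datetime.utcnow().isoformat(timespec="seconds"),
            },
        )
    producer.flush()


with DAG(
    dag_id="demo",
    start_date=datetime(2021, 1, 1),
    schedule_interval=timedelta(minutes=1),  # fire every minute
    catchup=False,
) as dag:
    PythonOperator(task_id="send_to_kafka", python_callable=send_demo_messages)
```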
- In the Druid console, load data from Kafka: use the bootstrap server `kafka:9092`, choose the `demo` topic, and configure the ingestion so it publishes a result table (a scripted alternative is sketched below).
- In Superset, add Druid as a database with the SQLAlchemy URI `druid://broker:8082/druid/v2/sql/`. More detail at Superset-Database-Connect. A quick way to test the URI outside Superset is sketched after this list.
- Create charts and a dashboard in Superset from the `demo` table.
- Enjoy! 🔥 🔥 🔥
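If you want to check the SQLAlchemy URI before wiring it into Superset, a small script like the following should work. It assumes the `pydruid` package (which provides the `druid://` dialect) and SQLAlchemy are installed, and that you query the broker from inside the Docker network (from the host, swap `broker` for `localhost` if the port is published):

```python
# Sketch: query the 'demo' datasource through the same URI Superset uses.
# Requires pydruid (for the druid:// dialect) and sqlalchemy.
from sqlalchemy import create_engine, text

engine = create_engine("druid://broker:8082/druid/v2/sql/")

with engine.connect() as conn:
    result = conn.execute(
        text('SELECT "name", COUNT(*) AS events FROM "demo" GROUP BY "name"')
    )
    for row in result:
        print(row)  # e.g. ('BTC', 42)
```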
- Email: duynnguyenngoc@hotmail.com - Duy Nguyen ❤️ ❤️ ❤️