The goal of this example is to simulate an e-commerce platform for computers and video games, with the following requirements:
- 3 computer brands of your choice
- 3 video game console brands
- Prices per computer, per console for each brand with their respective commission
- Consider the 5 most important cities in your country of residence
- Consider all payment methods you wish to use
- Consider source of conversion (Advertising, Influencers, Organic)
- Consider all order statuses you want
- Choose the number of stores per city you want, with their respective coordinates
Let's do it! 🚀
- Infrastructure
- Demo
- Python Script simulation
- Send Kafka Topics
- Mage Zone
- Apache Druid Deploy
- Grafana
I used Python to simulate the streaming data of my e-commerce platform for computers and video games, assuming 1,000 unique users. The code is available here:
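As a rough illustration of what such a simulation could look like, here is a minimal sketch that generates one random purchase record at a time. Everything in it is an assumption made for the example: the brand names, prices, commission rates, cities, payment methods, order statuses, and field names are placeholders, not the values used in the actual script. Stores and their coordinates are left out to keep it short.

```python
import random
import uuid
from datetime import datetime, timezone

# Placeholder catalogue: category -> brand -> (price, commission rate).
# All names and numbers here are hypothetical.
CATALOG = {
    "computer": {
        "ComputerBrandA": (1200.0, 0.05),
        "ComputerBrandB": (950.0, 0.04),
        "ComputerBrandC": (1500.0, 0.06),
    },
    "console": {
        "ConsoleBrandA": (499.0, 0.03),
        "ConsoleBrandB": (399.0, 0.03),
        "ConsoleBrandC": (299.0, 0.02),
    },
}
CITIES = ["City1", "City2", "City3", "City4", "City5"]
PAYMENT_METHODS = ["credit_card", "debit_card", "cash"]
SOURCES = ["Advertising", "Influencers", "Organic"]
STATUSES = ["Completed", "Pending", "Cancelled"]
N_USERS = 1000  # the 1,000 unique users mentioned above


def make_purchase(rng: random.Random) -> dict:
    """Build one simulated purchase record from the placeholder catalogue."""
    category = rng.choice(sorted(CATALOG))
    brand = rng.choice(sorted(CATALOG[category]))
    price, commission_rate = CATALOG[category][brand]
    return {
        "order_id": str(uuid.UUID(int=rng.getrandbits(128))),
        "user_id": rng.randint(1, N_USERS),
        "category": category,
        "brand": brand,
        "price": price,
        "commission": round(price * commission_rate, 2),
        "city": rng.choice(CITIES),
        "payment_method": rng.choice(PAYMENT_METHODS),
        "source": rng.choice(SOURCES),
        "status": rng.choice(STATUSES),
        "created_at": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    rng = random.Random(42)  # seeded so runs are repeatable
    for _ in range(3):
        print(make_purchase(rng))
```

Passing a seeded `random.Random` instance keeps the simulation reproducible, which helps when comparing dashboards across runs.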
I've created 2 Kafka topics to analyze the purchase behavior of our customers:
- topic_purchase_records: This topic receives the raw purchase records
- topic_druid_grafana_new: This topic receives the records from topic_purchase_records after they have been enriched
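Publishing a simulated record to the raw topic could be sketched as follows. This assumes JSON-encoded messages and a producer object that exposes `send(topic, value=...)`, as the `KafkaProducer` class from the kafka-python package does. The helper names `serialize` and `publish` and the broker address in the comment are hypothetical.

```python
import json

RAW_TOPIC = "topic_purchase_records"


def serialize(record: dict) -> bytes:
    """Encode a purchase record as UTF-8 JSON bytes (assumed message format)."""
    return json.dumps(record).encode("utf-8")


def publish(producer, record: dict, topic: str = RAW_TOPIC) -> None:
    """Send one record through any producer exposing send(topic, value=...)."""
    producer.send(topic, value=serialize(record))


# With kafka-python installed, a real producer could be wired in like this
# (the broker address is an assumption):
#
#   from kafka import KafkaProducer
#   producer = KafkaProducer(bootstrap_servers="localhost:9092")
#   publish(producer, {"order_id": "1", "status": "Completed"})
#   producer.flush()
```

Keeping the producer as a parameter means the same function can be exercised with a stand-in object during local testing, without a running broker.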
I've created 2 streaming pipelines to enrich my Kafka topics and to store the new records in an S3 bucket:
- Kafka_transform: This pipeline parses the created_at field to create 2 new columns, Date (YYYY-MM-DD) and Hour
- Store_s3_bucket: This pipeline writes Parquet files to the S3 bucket, using the following configuration:
connector_type: amazon_s3
bucket: mage-hackaton
prefix: purchase
file_type: parquet
buffer_size_mb: 5
buffer_timeout_seconds: 100
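The enrichment done by Kafka_transform, deriving Date and Hour from created_at, can be sketched in plain Python as below. It assumes created_at is an ISO 8601 string and that Hour is stored as an integer; the function names are hypothetical, and in the real pipeline this logic would live inside a transformer block.

```python
from datetime import datetime
from typing import Dict, List


def add_date_and_hour(record: Dict) -> Dict:
    """Return a copy of the record with Date (YYYY-MM-DD) and Hour added."""
    created = datetime.fromisoformat(record["created_at"])  # assumes ISO 8601
    enriched = dict(record)  # leave the incoming record untouched
    enriched["Date"] = created.strftime("%Y-%m-%d")
    enriched["Hour"] = created.hour
    return enriched


def transform_batch(messages: List[Dict]) -> List[Dict]:
    """Apply the enrichment to every message in a batch."""
    return [add_date_and_hour(message) for message in messages]


if __name__ == "__main__":
    print(transform_batch([{"order_id": "1", "created_at": "2024-01-15T13:45:00"}]))
```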
I've deployed Apache Druid on an EC2 instance so that it can consume the Kafka topic stream, which lets me analyze my purchase records quickly and easily.
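To give an idea of how Druid is pointed at a Kafka topic, here is a hedged sketch that builds a Kafka supervisor spec as a Python dictionary. The data source name, the broker address, and the choice of created_at as the timestamp column are assumptions made for illustration; the spec actually used in this project may differ.

```python
import json


def kafka_supervisor_spec(topic: str, bootstrap_servers: str, datasource: str) -> dict:
    """Build a minimal Druid Kafka supervisor spec for JSON messages."""
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                # Assumes created_at carries the event time of each record.
                "timestampSpec": {"column": "created_at", "format": "auto"},
                # An empty list leaves dimension discovery to Druid.
                "dimensionsSpec": {"dimensions": []},
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "DAY",
                    "queryGranularity": "NONE",
                    "rollup": False,
                },
            },
            "ioConfig": {
                "type": "kafka",
                "topic": topic,
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
                "useEarliestOffset": True,
            },
            "tuningConfig": {"type": "kafka"},
        },
    }


if __name__ == "__main__":
    spec = kafka_supervisor_spec(
        topic="topic_druid_grafana_new",
        bootstrap_servers="localhost:9092",  # assumed broker address
        datasource="purchase_records",       # hypothetical data source name
    )
    print(json.dumps(spec, indent=2))
```

The resulting JSON can be pasted into the Druid web console, or submitted with a POST request to the `/druid/indexer/v1/supervisor` endpoint.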
I also want to share an article I wrote recently about how to deploy Apache Druid on EC2 instances
- EC2 instance type: t2.xlarge
- Go Pricing
When the % Completed Orders metric falls below 60%, we get an alert message like this one, so that we can make a decision quickly whenever the metric drops below its threshold
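The rule behind this alert can be expressed in a few lines of Python. This is only a sketch of the logic, not the Grafana alert definition itself: the "Completed" status label and the function names are assumptions, and the 60% threshold comes from the text above.

```python
from typing import List, Optional


def completed_order_pct(statuses: List[str], completed_label: str = "Completed") -> Optional[float]:
    """Return the percentage of orders with the completed status, or None if there are no orders."""
    if not statuses:
        return None
    completed = sum(1 for status in statuses if status == completed_label)
    return 100.0 * completed / len(statuses)


def should_alert(statuses: List[str], threshold_pct: float = 60.0) -> bool:
    """Fire only when there is data and the completed share is below the threshold."""
    pct = completed_order_pct(statuses)
    return pct is not None and pct < threshold_pct


if __name__ == "__main__":
    sample = ["Completed"] * 5 + ["Cancelled"] * 3 + ["Pending"] * 2
    print(completed_order_pct(sample), should_alert(sample))
```

Returning `None` for an empty window avoids firing an alert simply because no orders have arrived yet.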