I built a streaming data solution for log-based change data capture (CDC) using MySQL, AWS Kinesis, AWS Lambda, and Python.
The data flow is built in the following steps:
CREATE TABLE users(user_id INTEGER, first_name VARCHAR(200), last_name VARCHAR(200), PRIMARY KEY (user_id));
CREATE TABLE wages(user_id INTEGER, wage INTEGER, PRIMARY KEY (user_id));
Step 3: Run the Python script to capture CDC events (inserts, updates, and deletes) from the MySQL source database and send them to the Kinesis stream
The Python script is at Python/mysql_to_kinesis.py
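
As a rough illustration of what this step involves, here is a minimal sketch. It is not the contents of Python/mysql_to_kinesis.py. It assumes the python-mysql-replication package (`pymysqlreplication`) for reading the binlog and `boto3` for writing to Kinesis, and it assumes the source MySQL server has row-based binary logging enabled. The event keys (`type`, `schema`, `table`, `row`, `before`), the helper names `build_event` and `run`, and the `server_id` value are hypothetical choices made for this example.

```python
import json


def build_event(event_type, schema, table, row):
    """Flatten one binlog row into a JSON-serializable CDC event (hypothetical shape)."""
    event = {"type": event_type, "schema": schema, "table": table}
    if event_type == "update":
        # Update rows carry both the old and the new column values.
        event["before"] = row["before_values"]
        event["row"] = row["after_values"]
    else:
        # Insert and delete rows carry a single set of column values.
        event["row"] = row["values"]
    return event


def run(connection_settings, stream_name, region_name):
    """Tail the MySQL binlog and forward every row change to a Kinesis stream."""
    # Third-party imports are local so build_event stays usable without them.
    import boto3
    from pymysqlreplication import BinLogStreamReader
    from pymysqlreplication.row_event import (
        DeleteRowsEvent,
        UpdateRowsEvent,
        WriteRowsEvent,
    )

    kinds = [
        (WriteRowsEvent, "insert"),
        (UpdateRowsEvent, "update"),
        (DeleteRowsEvent, "delete"),
    ]
    kinesis = boto3.client("kinesis", region_name=region_name)
    reader = BinLogStreamReader(
        connection_settings=connection_settings,
        server_id=100,  # assumption: any id not used by another replica
        blocking=True,  # keep waiting for new binlog events
        resume_stream=True,
        only_events=[cls for cls, _ in kinds],
    )
    try:
        for binlog_event in reader:
            kind = next(name for cls, name in kinds if isinstance(binlog_event, cls))
            for row in binlog_event.rows:
                event = build_event(kind, binlog_event.schema, binlog_event.table, row)
                kinesis.put_record(
                    StreamName=stream_name,
                    # default=str covers dates and decimals that json cannot encode
                    Data=json.dumps(event, default=str).encode("utf-8"),
                    # Same user_id -> same shard, so per-user ordering is kept.
                    PartitionKey=str(event["row"]["user_id"]),
                )
    finally:
        reader.close()
```

Under these assumptions, the script would be started with something like `run({"host": "localhost", "port": 3306, "user": "...", "passwd": "..."}, "my-stream", "us-east-1")`, where the connection values and stream name are placeholders. Partitioning by `user_id` works here because both source tables use it as their primary key.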
CREATE TABLE users(user_id INTEGER, first_name VARCHAR(200), last_name VARCHAR(200), PRIMARY KEY (user_id));
CREATE TABLE wages(user_id INTEGER, wage INTEGER, PRIMARY KEY (user_id));
CREATE TABLE user_wages(user_id INTEGER, full_name VARCHAR(200), wage INTEGER, PRIMARY KEY (user_id));
Step 5: Create an AWS Lambda function triggered by the Kinesis stream, and include the Python script that consumes CDC events and applies them to the RDS MySQL target database
The Lambda Python script is at Python/lambda.py
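
To make the consumer side concrete, here is a minimal sketch of such a handler. It is not the contents of Python/lambda.py. It assumes each Kinesis record carries a JSON document with the hypothetical keys `type` ("insert", "update", or "delete"), `table`, and `row`, that the `pymysql` package is bundled with the function, and that the connection details come from hypothetical environment variables (`DB_HOST`, `DB_USER`, `DB_PASSWORD`, `DB_NAME`). Rebuilding the `user_wages` row from a join of `users` and `wages` is also an assumption about how that table is meant to be filled.

```python
import base64
import json

# Columns of the mirrored tables, taken from the CREATE TABLE statements above.
TABLES = {
    "users": ["user_id", "first_name", "last_name"],
    "wages": ["user_id", "wage"],
}

# Assumption: user_wages is derived by joining users and wages on user_id.
INSERT_USER_WAGES = (
    "INSERT INTO user_wages (user_id, full_name, wage) "
    "SELECT u.user_id, CONCAT(u.first_name, ' ', u.last_name), w.wage "
    "FROM users u JOIN wages w ON w.user_id = u.user_id "
    "WHERE u.user_id = %s"
)


def decode_records(event):
    """Return the CDC events carried by one Kinesis-triggered invocation."""
    # Lambda delivers each Kinesis payload base64-encoded under kinesis.data.
    return [
        json.loads(base64.b64decode(record["kinesis"]["data"]))
        for record in event["Records"]
    ]


def to_statement(cdc):
    """Translate one CDC event into a parameterized (sql, params) pair."""
    table = cdc["table"]
    columns = TABLES[table]  # raises KeyError for tables we do not mirror
    row = cdc["row"]
    if cdc["type"] == "delete":
        return f"DELETE FROM {table} WHERE user_id = %s", (row["user_id"],)
    placeholders = ", ".join(["%s"] * len(columns))
    # REPLACE handles insert and update alike because user_id is the primary key.
    sql = f"REPLACE INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    return sql, tuple(row[c] for c in columns)


def lambda_handler(event, context):
    """Apply each CDC event to the target database, then refresh user_wages."""
    import os

    import pymysql  # assumption: packaged with the Lambda deployment

    conn = pymysql.connect(
        host=os.environ["DB_HOST"],
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASSWORD"],
        database=os.environ["DB_NAME"],
    )
    try:
        with conn.cursor() as cur:
            for cdc in decode_records(event):
                sql, params = to_statement(cdc)
                cur.execute(sql, params)
                user_id = cdc["row"]["user_id"]
                # Rebuild the derived row; it stays absent if either side is missing.
                cur.execute("DELETE FROM user_wages WHERE user_id = %s", (user_id,))
                cur.execute(INSERT_USER_WAGES, (user_id,))
        conn.commit()
    finally:
        conn.close()
    return {"processed": len(event["Records"])}
```

Keeping `decode_records` and `to_statement` free of database calls lets them be checked locally with a hand-built event before the function is deployed. Because the whole batch is committed once at the end, a failure partway through leaves the target unchanged and the batch can be retried.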