Build a serverless pipeline to analyze streaming data using AWS Glue, Apache Hudi & Amazon S3

This repository contains a CloudFormation template and a PySpark code sample for an AWS Glue streaming job that implements the following ETL pipeline:

[Figure: architecture diagram of the ETL pipeline (screenshot)]

Related AWS blog post: https://aws.amazon.com/blogs/big-data/build-a-serverless-pipeline-to-analyze-streaming-data-using-aws-glue-apache-hudi-and-amazon-s3/
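
As a rough illustration of the Hudi write step in a pipeline like this one, the sketch below builds Hudi writer options for upserting a streaming micro-batch into a table on S3. It is not taken from this repository's code: the helper names, table name, field names, and target path are hypothetical placeholders, and the actual Glue job may be configured differently.

```python
def build_hudi_options(table_name, record_key, precombine_field, partition_field):
    """Return Hudi datasource write options for an upsert.

    All argument values are supplied by the caller; the ones used in the
    example at the bottom are hypothetical placeholders.
    """
    return {
        "hoodie.table.name": table_name,
        # Column that uniquely identifies a record.
        "hoodie.datasource.write.recordkey.field": record_key,
        # Column used to pick the latest version when keys collide.
        "hoodie.datasource.write.precombine.field": precombine_field,
        # Column used to lay out partitions under the table path.
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.operation": "upsert",
        "hoodie.datasource.write.table.type": "COPY_ON_WRITE",
    }


def write_batch(batch_df, batch_id, options, target_path):
    """Write one micro-batch (a Spark DataFrame) to a Hudi table.

    The (batch_df, batch_id) signature matches what Structured Streaming's
    foreachBatch passes in; options and target_path would be bound by the
    caller, for example with functools.partial.
    """
    batch_df.write.format("hudi").options(**options).mode("append").save(target_path)


# Hypothetical example values, for illustration only.
options = build_hudi_options("example_table", "id", "updated_at", "event_date")
```

Keeping the options in a plain dictionary makes the configuration easy to inspect and adjust without a Spark session; only `write_batch` needs a running Spark job with the Hudi libraries available.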

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.