Event Streaming with Kinesis

This project was inspired by graduate classes in data engineering. A great part of the material was built with the Rony package and Professor Neylson Crepalde's class code.

The main idea of the project was to reproduce a simple dataflow fed by event simulation. Kinesis was responsible for collecting events and inserting them into an S3 bucket. The Glue Crawler service was responsible for cataloguing the data so it could be queried with Athena. CloudWatch services were also used.
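As a minimal sketch of the ingest side, an event can be pushed into Kinesis with boto3. The stream name `events-stream`, the event schema, and the helper names below are illustrative assumptions, not names taken from this project:

```python
import json
import random
import uuid
from datetime import datetime, timezone


def make_event() -> dict:
    """Build one simulated event (hypothetical schema for illustration)."""
    return {
        "event_id": str(uuid.uuid4()),
        "value": random.randint(0, 100),
        "created_at": datetime.now(timezone.utc).isoformat(),
    }


def send_event(client, stream_name: str, event: dict) -> None:
    """Put a single JSON record onto the Kinesis data stream."""
    client.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=event["event_id"],  # spreads records across shards
    )


if __name__ == "__main__":
    import boto3  # requires AWS credentials configured locally

    kinesis = boto3.client("kinesis")
    # "events-stream" is an assumed name; use the stream created by your Terraform code.
    send_event(kinesis, "events-stream", make_event())
```

The partition key decides which shard a record lands on; using a per-event UUID keeps the load evenly distributed.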

Solution Architecture:

(architecture diagram: kinesis)

  1. Create all the secrets in your GitHub repository.
  2. Create a key pair in EC2.
  3. Don't forget to change your account ID inside the policy ARN!
  4. You must have at least one bucket with your raw data to process.
  5. Create a pull request on the dev branch in order to test all structures in CI/CD.
  6. Check in GitHub Actions that everything is OK.
  7. Check in AWS that all structures and products were created.
  8. Run simulations_to_kinesis.py in order to generate random data to be collected by Kinesis.
  9. Run terraform init.
  10. Run terraform destroy (it won't destroy an S3 bucket that is not empty).
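To sanity-check that the simulated events actually arrived and were crawled, the table can be queried through Athena. This is a hedged sketch: the database name `events_db`, table name `events`, and result bucket `s3://my-athena-results/` are placeholders for whatever your own stack created:

```python
def build_count_query(database: str, table: str) -> str:
    """SQL that counts the rows the Glue Crawler has catalogued."""
    return f'SELECT COUNT(*) AS n FROM "{database}"."{table}"'


if __name__ == "__main__":
    import time
    import boto3  # requires AWS credentials configured locally

    athena = boto3.client("athena")
    # Database, table, and output bucket below are illustrative assumptions.
    qid = athena.start_query_execution(
        QueryString=build_count_query("events_db", "events"),
        ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
    )["QueryExecutionId"]

    # Athena queries are asynchronous: poll until the query finishes.
    while True:
        status = athena.get_query_execution(QueryExecutionId=qid)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(1)

    if state == "SUCCEEDED":
        rows = athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]
        print(rows[1]["Data"][0]["VarCharValue"])  # rows[0] is the header row
```

If the count is greater than zero, the whole path (simulation → Kinesis → bucket → crawler → Athena) is working end to end.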