This project was created as part of Udacity's Data Engineering Nanodegree. The project flow was:
- take various CSV files made up of logs,
- aggregate them into a single dataset,
- and finally clean and transform the data into 3 analytics tables in a local Cassandra cluster.
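
The aggregation step above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the exact notebook code: the function name `aggregate_csv_logs`, the directory name `logs`, and the output file name `combined.csv` are all hypothetical, and the sketch assumes every input file shares the same header row.

```python
import csv
import glob
import os


def aggregate_csv_logs(input_dir, output_path):
    """Concatenate every CSV file in input_dir into one file with a single header.

    Hypothetical helper: assumes all input files share the same header row.
    Returns the number of data rows written.
    """
    paths = sorted(glob.glob(os.path.join(input_dir, "*.csv")))
    # Never read the output file back in if it already lives in input_dir.
    paths = [p for p in paths if os.path.abspath(p) != os.path.abspath(output_path)]

    header = None
    rows_written = 0
    with open(output_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file)
        for path in paths:
            with open(path, newline="", encoding="utf-8") as in_file:
                reader = csv.reader(in_file)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # empty file, nothing to copy
                if header is None:
                    # Write the header once, taken from the first non-empty file.
                    header = file_header
                    writer.writerow(header)
                for row in reader:
                    if not any(cell.strip() for cell in row):
                        continue  # skip blank rows
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

A call such as `aggregate_csv_logs("logs", "combined.csv")` would then produce one file that can be read row by row when loading the tables into Cassandra.
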

This repo consists of:
- a Jupyter notebook, which contains the code for this project,
- a CSV file, which contains the aggregated dataset created in the Jupyter notebook and used for loading into Cassandra.