Primary language: Jupyter Notebook. License: Apache License 2.0 (Apache-2.0).

moringa-wk7

Building data pipelines

Week 7 - Monday project - Data pipelines with Python


Steps:
1- Data was extracted from CSV files shared by the customer.
2- The data was transformed, e.g. by imputing missing promo values and converting data types.
3- The data was aggregated to explore its characteristics, e.g. by calculating mean values.
4- The cleaned data was exported to a CSV file on local disk for further analysis.
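
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's code: it uses only the Python standard library, and the sample data, the column names (`store_id`, `sales`, `promo`) and the rule "a blank promo means 0" are all assumptions made for the example.

```python
import csv
import io
from statistics import mean

# Hypothetical sample standing in for the customer's CSV file.
RAW_CSV = """store_id,sales,promo
1,100.5,1
2,80.0,
3,120.25,0
"""


def extract(source):
    """Step 1: read CSV rows into a list of dicts (all values are strings)."""
    return list(csv.DictReader(source))


def transform(rows):
    """Step 2: impute missing promo values and convert column types."""
    cleaned = []
    for row in rows:
        cleaned.append({
            "store_id": int(row["store_id"]),
            "sales": float(row["sales"]),
            # Assumption: a blank promo field means no promotion ran.
            "promo": int(row["promo"]) if row["promo"] else 0,
        })
    return cleaned


def load(rows, target):
    """Step 4: write the cleaned rows to a CSV target."""
    writer = csv.DictWriter(target, fieldnames=["store_id", "sales", "promo"])
    writer.writeheader()
    writer.writerows(rows)


rows = transform(extract(io.StringIO(RAW_CSV)))

# Step 3: a simple aggregation to explore the data.
mean_sales = mean(row["sales"] for row in rows)

# An in-memory buffer keeps the sketch self-contained; to write to local
# disk, pass open("cleaned.csv", "w", newline="") instead (hypothetical path).
output = io.StringIO()
load(rows, output)
```

Keeping extract, transform and load as separate functions makes each step easy to test on its own and to swap out, for example replacing the CSV target with a database.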

URL: https://github.com/joekibz/moringa-wk7/blob/main/%5BJThiongo%5D_Data_Pipelines_with_Python.ipynb

Week 7 - Tuesday project - Data pipelines with Python and PostgreSQL


Steps:
1- Data was extracted from CSV files shared by the customer.
2- The data was transformed, e.g. by dropping duplicate date columns and converting data types.
3- The data was aggregated to explore its characteristics, e.g. by calculating the mean sensor reading.
4- The cleaned data was uploaded to a PostgreSQL database hosted on Google Cloud.
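
As a rough sketch of these steps, the example below drops a duplicated date column, converts types, computes a mean reading and loads the result into a database. To stay runnable without a server it swaps in an in-memory SQLite database for PostgreSQL. The sample data, the column names (`date`, `date_dup`, `sensor_id`, `reading`) and the table name `sensor_readings` are assumptions for illustration.

```python
import csv
import io
import sqlite3
from datetime import date
from statistics import mean

# Hypothetical sample; `date_dup` plays the duplicate date column.
RAW_CSV = """date,sensor_id,reading,date_dup
2023-01-01,a1,20.5,2023-01-01
2023-01-02,a1,21.5,2023-01-02
2023-01-03,b2,19.0,2023-01-03
"""


def transform(rows):
    """Drop the duplicate date column and convert types."""
    records = []
    for row in rows:
        # Parsing validates the date; it is stored back as ISO text.
        reading_date = date.fromisoformat(row["date"]).isoformat()
        # `date_dup` is simply not copied over, which drops it.
        records.append((reading_date, row["sensor_id"], float(row["reading"])))
    return records


def load(conn, records):
    """Create the target table if needed and insert all records."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sensor_readings "
        "(reading_date TEXT, sensor_id TEXT, reading REAL)"
    )
    conn.executemany("INSERT INTO sensor_readings VALUES (?, ?, ?)", records)
    conn.commit()


records = transform(csv.DictReader(io.StringIO(RAW_CSV)))
mean_reading = mean(record[2] for record in records)

# SQLite stands in for PostgreSQL here. With a PostgreSQL driver such as
# psycopg2 the same DB-API pattern applies, but placeholders are %s, not ?.
conn = sqlite3.connect(":memory:")
load(conn, records)
row_count = conn.execute("SELECT COUNT(*) FROM sensor_readings").fetchone()[0]
```

Parameterized inserts (`?` or `%s` placeholders) let the driver handle quoting, which is safer than building SQL strings by hand.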

URL: https://github.com/joekibz/moringa-wk7/blob/main/%5BJThiongo%5D_Data_pipelines_with_Python_and_PostgreSQL.ipynb

Week 7 - Wednesday project - Data pipeline with Python and MongoDB


Steps:
1- Data was extracted from CSV files shared by the customer.
2- The data was transformed and enriched, e.g. by converting data types and adding a duration_minutes column.
3- Data loading functions were implemented in three variants: without compression, with compression, and with aggregation.
4- The data was loaded into MongoDB collections hosted on mongodb.com.
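
The enrichment and aggregation steps might look something like the sketch below, which turns CSV rows into MongoDB-style documents with a computed `duration_minutes` field. The sample data, the column names (`session_id`, `user_id`, `start_time`, `end_time`) and the per-user total are assumptions for illustration. The sketch leaves out the compressed loading variant, since the steps above do not say how compression was applied.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical sample standing in for the customer's CSV file.
RAW_CSV = """session_id,user_id,start_time,end_time
s1,u1,2023-01-01 10:00:00,2023-01-01 10:30:00
s2,u1,2023-01-01 11:00:00,2023-01-01 11:45:00
s3,u2,2023-01-02 09:15:00,2023-01-02 09:25:00
"""
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def to_document(row):
    """Convert one CSV row into a document with typed, enriched fields."""
    start = datetime.strptime(row["start_time"], TIME_FORMAT)
    end = datetime.strptime(row["end_time"], TIME_FORMAT)
    return {
        "session_id": row["session_id"],
        "user_id": row["user_id"],
        "start_time": start,
        "end_time": end,
        # Enrichment: duration derived from the two timestamps.
        "duration_minutes": (end - start).total_seconds() / 60,
    }


def total_minutes_by_user(documents):
    """Aggregate total duration per user before loading."""
    totals = defaultdict(float)
    for doc in documents:
        totals[doc["user_id"]] += doc["duration_minutes"]
    return dict(totals)


documents = [to_document(row) for row in csv.DictReader(io.StringIO(RAW_CSV))]
totals = total_minutes_by_user(documents)

# Loading is left as a comment so the sketch runs without a server.
# With pymongo installed and a connection string `uri` (hypothetical names):
#   from pymongo import MongoClient
#   MongoClient(uri)["pipeline_db"]["sessions"].insert_many(documents)
```

Storing timestamps as `datetime` objects rather than strings lets the database sort and filter them as dates once the documents are loaded.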

URL: https://github.com/joekibz/moringa-wk7/blob/main/%5BJThiongo%5D_Data_pipelines_with_Python_and_MongoDB.ipynb