Primary language: Jupyter Notebook. License: MIT.

Data Project Framework

A high-level conceptual framework to help organize work on data-related projects.

Acknowledgements

This repository is largely inspired by Abhishek Thakur's ML Framework Repo.

Dataset Used

The dataset used for applying this framework is the Graduate Admission 2 dataset available on Kaggle.

What each directory means

  • dataset : All raw and cleaned data is stored here
  • notebooks : Contains Jupyter notebooks for EDA, Data Storytelling etc.
  • src : Contains Python scripts used predominantly for predictive modelling
  • utils : Contains utility scripts to aid in faster analysis
  • models : All trained models are stored here (as .pkl files)
  • output : Contains output files (submissions, images, reports etc.)
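
As an illustration of the models convention above, here is a minimal sketch of how a script in src might save and reload a trained model as a .pkl file. The helper names `save_model` and `load_model` and the `root` parameter are hypothetical and are not taken from this repository; only the `models` directory name and the .pkl extension come from the list above.

```python
import pickle
from pathlib import Path


def save_model(model, name, root="."):
    """Pickle `model` to <root>/models/<name>.pkl and return the file path.

    Hypothetical helper: the repository may organize this differently.
    """
    models_dir = Path(root) / "models"
    # Create the models directory if it does not exist yet.
    models_dir.mkdir(parents=True, exist_ok=True)
    path = models_dir / f"{name}.pkl"
    with path.open("wb") as f:
        pickle.dump(model, f)
    return path


def load_model(name, root="."):
    """Load a model previously written by save_model."""
    path = Path(root) / "models" / f"{name}.pkl"
    with path.open("rb") as f:
        return pickle.load(f)
```

Any picklable object works here, so the same helpers can store a fitted estimator or a plain dictionary of parameters. Only unpickle files you trust, since loading a pickle can execute arbitrary code.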

Using the Repo

The repository's directory structure can serve as a template for your own data science and machine learning projects.
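
One way to reuse the layout is to create the same directories in a new project. The sketch below assumes nothing beyond the six directory names listed in this README; the `scaffold` function and the `DIRECTORIES` constant are hypothetical and do not ship with the repository.

```python
from pathlib import Path

# Directory names as listed in this README.
DIRECTORIES = ("dataset", "notebooks", "src", "utils", "models", "output")


def scaffold(root):
    """Create the framework's directories under `root` and return their paths.

    Existing directories are left untouched, so the call is safe to repeat.
    """
    root = Path(root)
    created = []
    for name in DIRECTORIES:
        path = root / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

For example, `scaffold("my-project")` would produce `my-project/dataset`, `my-project/notebooks`, and so on, where `my-project` is a placeholder name.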