
Interview project repository for data analysis and prediction for 6Nomads data.


Interview Project: 6Nomads


Description

This repository houses my data science interview project for 6Nomads.

My task is to make multi-class predictions from the provided training and testing datasets. I am permitted to use Python and Jupyter technologies to create an end-to-end data pipeline, including algorithms for analysis, processing, and modeling.

To preserve data security, I have not included any of the datasets in the final submission of this repository. Instead, I have included a concise tree structure that depicts my project hierarchy, including the organization of my data files.

Please enjoy!

Project Hierarchy

6nomads-interview-project
│   README.md
│   LICENSE
│   .gitignore
│   ...
│
└───notebooks
│   │   01-exploratory-data-analysis.ipynb
│   │   02-intermediate-data-processing.ipynb
│   │   03-predictive-data-modeling.ipynb
│   
└───structures
│   │   custom_structures.py
│   │   dataset_preprocessor.py
│   │   dataset_processor.py
│   
└───data
│   │
│   └───external
│   │   │   train.csv
│   │   │   test.csv
│   │
│   └───interim
│   │   │   train_i.csv
│   │   │   test_i.csv
│   │
│   └───processed
│       │   
│       └───X
│       │   │
│       │   └───processed
│       │   │   │   train_pXp.csv
│       │   │   │   test_pXp.csv
│       │   │
│       │   └───scaled
│       │   │   │   train_pXs.csv
│       │   │   │   test_pXs.csv
│       │   │
│       │   └───reduced
│       │   │   │   train_pXr.csv
│       │   │   │   test_pXr.csv
│       │
│       └───y
│       │   │
│       │   └───processed
│       │       │   train_pyp.csv
│       │       │   test_pyp.csv
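
Because the datasets are not shipped with the repository, anyone reproducing the notebooks has to place their own files into this layout. The sketch below shows one way the file naming in the tree could be resolved in code. The `DATA_LAYOUT` mapping, the stage names, and the `data_path` helper are hypothetical illustrations and are not part of the modules under `structures`.

```python
from pathlib import Path

# Hypothetical mapping from a stage name to (subdirectory, filename pattern).
# The subdirectories and suffixes mirror the tree above; the stage names
# themselves are assumptions made for this example.
DATA_LAYOUT = {
    "external": ("external", "{split}.csv"),
    "interim": ("interim", "{split}_i.csv"),
    "X_processed": ("processed/X/processed", "{split}_pXp.csv"),
    "X_scaled": ("processed/X/scaled", "{split}_pXs.csv"),
    "X_reduced": ("processed/X/reduced", "{split}_pXr.csv"),
    "y_processed": ("processed/y/processed", "{split}_pyp.csv"),
}


def data_path(stage, split, root="data"):
    """Return the path of a dataset file for a given stage and split."""
    if split not in ("train", "test"):
        raise ValueError(f"unknown split: {split!r}")
    if stage not in DATA_LAYOUT:
        raise ValueError(f"unknown stage: {stage!r}")
    subdir, pattern = DATA_LAYOUT[stage]
    return Path(root) / subdir / pattern.format(split=split)


if __name__ == "__main__":
    # List every file the tree expects, relative to the repository root.
    for stage in DATA_LAYOUT:
        for split in ("train", "test"):
            print(data_path(stage, split).as_posix())
```

With the files in place locally, a notebook could then pass the result to a CSV reader, for example `pandas.read_csv(data_path("interim", "train"))`.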

Dependencies

License

The content of this project and the source code used to format and display that content are both licensed under the MIT License.


Constructed and documented by Aakash "Kash" Sudhakar (2019).