
Kaggle Coursera Final Project

Primary language: Jupyter Notebook

kaggle_coursera

migai/Kag: repository for Coursera's Competitive Data Science course, taught by NRU-HSE.
Team: Andreas Theodoulou and Michael Gaidis


Data files for the final competition reside in the "readonly/final_project_data" directory.


template_Kaggle_Coursera_Final_Assignment.ipynb is a starter notebook for working in Google Colab. It loads all data and helper code files from this GitHub repository into the Colab environment.
Fork it and use it at your discretion, then rename your copy with a version number and store it in the "ipynb_versions" directory or in a new branch you create. Modify the "template" ipynb in the top directory only when a change is needed by every ipynb notebook we use in the competition. Otherwise, clone or fork from the modified version you have been working on.
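As a rough illustration of what "loading the repository into Colab" can look like, here is a minimal sketch. It is not the template notebook's actual code: the function names `clone_repo` and `add_helper_code_to_path`, the repository URL, and the "/content/Kag" destination are all assumptions made for this example.

```python
import subprocess
import sys
from pathlib import Path


def clone_repo(repo_url: str, dest: Path) -> Path:
    """Shallow-clone repo_url into dest, unless dest already exists."""
    if not dest.exists():
        subprocess.run(
            ["git", "clone", "--depth", "1", repo_url, str(dest)],
            check=True,  # raise if git fails, so the notebook stops early
        )
    return dest


def add_helper_code_to_path(repo_dir: Path, subdir: str = "helper_code") -> Path:
    """Put the repo's helper directory on sys.path so its modules import."""
    helper_dir = (repo_dir / subdir).resolve()
    if str(helper_dir) not in sys.path:
        sys.path.insert(0, str(helper_dir))
    return helper_dir


# In a Colab cell one might then run (URL and destination are assumptions):
# repo = clone_repo("https://github.com/migai/Kag", Path("/content/Kag"))
# add_helper_code_to_path(repo)
# import kaggle_utils_at_mg
```

Keeping the clone step idempotent means the cell can be re-run after a Colab runtime reconnect without failing on an existing directory.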


The "helper_code" directory also contains a starter file, "kaggle_utils_at_mg.py". At present it is just a template for how we could store code snippets outside our Jupyter notebooks to keep them readable.
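To give a sense of the kind of snippet that could live in such a helper file, here is a small hypothetical example. The `timed` context manager is not part of "kaggle_utils_at_mg.py"; it is an assumption chosen only to show how moving boilerplate into a module can shorten notebook cells.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results=None):
    """Time the enclosed block; optionally record seconds in `results`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        if results is not None:
            results[label] = elapsed  # keep timings for later comparison
        print(f"{label}: {elapsed:.2f} s")


# Usage in a notebook cell (hypothetical step name):
# with timed("feature generation"):
#     ...  # expensive work goes here
```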


The "data_output" directory can be used to store modified data (e.g., with extra feature columns). It can also hold compressed or pickled files that capture anything else you'd like to keep, such as important hyperparameters.
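One possible way to combine compression and pickling for such files is sketched below, using only the Python standard library. The helper names `save_artifact` and `load_artifact`, the file name, and the example hyperparameters are assumptions for illustration, not something this repository defines.

```python
import gzip
import pickle
from pathlib import Path


def save_artifact(obj, path: Path) -> None:
    """Pickle obj and write it gzip-compressed to path."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_artifact(path: Path):
    """Read back an object written by save_artifact."""
    with gzip.open(path, "rb") as f:
        return pickle.load(f)


# Example (file name and values are made up for illustration):
# run = {"hyperparameters": {"learning_rate": 0.05, "max_depth": 8},
#        "feature_columns": ["item_id", "shop_id"]}
# save_artifact(run, Path("data_output") / "run_v01.pkl.gz")
# restored = load_artifact(Path("data_output") / "run_v01.pkl.gz")
```

Storing the hyperparameters next to the derived data in one file makes it harder for the two to drift apart between notebook versions. Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.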


For ideas on useful helper code files, and on how to proceed with EDA, feature generation, and modeling in the competition, look in the "readonly/kaggletils" and "readonly/examples" directories. They hold files from our Coursera instructors and from other pioneers in this and other Kaggle competitions.

EDA from DennisE: Open In Colab

Modeling from DennisE: Open In Colab