This repository was prepared as the final project for MIDS W207 at the University of California, Berkeley.
- Age_Gender_Assessment-2: Notebook containing an assessment of the impact of age and gender on the raw features using a CatBoost model
- EDA: Notebook containing exploratory data analysis and assessment using LGBM and Logistic Regression models
- Feature Engineered Data-NaiveBayes: Notebook containing assessment using Naive Bayes model
- Feature Engineered Data-RandomForest: Notebook containing assessment using Random Forest model
- Feature Engineered Data: Notebook containing the step-by-step feature engineering process
- LogRegTrial_3: Notebook containing a Logistic Regression model on raw features, showing the data-related errors observed
- README: Summary of project files
- W207_Project_Retrieval_and_Ranking_TFRS: Notebook containing auto feature retrieval and collaborative ranking using the TensorFlow Recommenders (TFRS) package
- kaggle.json: JSON file containing output from the TFRS package
- targets.csv: Target variable file exported as csv from Feature Engineering notebook

We attempted to upload the features.csv file, but GitHub rejected it because of its size (4.3 GB).
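
Since features.csv is not in the repository, it has to be produced locally. We assume here that it comes from the Feature Engineering notebook, as targets.csv does. A file of that size may not fit comfortably in memory, so the following is a minimal sketch of reading it in chunks using only the Python standard library. The function names `iter_csv_chunks` and `count_rows` and the default chunk size are illustrative and are not part of the project notebooks.

```python
import csv
from itertools import islice
from typing import Dict, Iterator, List


def iter_csv_chunks(path: str, chunk_size: int = 100_000) -> Iterator[List[Dict[str, str]]]:
    """Yield lists of at most `chunk_size` rows (as dicts) from a CSV file.

    Only one chunk is held in memory at a time.
    """
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)  # first line is treated as the header
        while True:
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                break
            yield chunk


def count_rows(path: str, chunk_size: int = 100_000) -> int:
    """Count data rows (excluding the header) without loading the whole file."""
    return sum(len(chunk) for chunk in iter_csv_chunks(path, chunk_size))
```

For example, `count_rows("features.csv")` would report the number of data rows once the file has been generated locally. The path is an assumption about where the file is saved.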