Project-3

Salary prediction using census bureau database

Primary language: Jupyter Notebook

  1. Read dataset from "https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data"
  2. Explore dataset using pandas
  3. Perform preprocessing (handle missing/duplicate/categorical data)
  4. Rank feature importance with a random forest classifier
  5. Select n features, then apply several classification models (e.g. logistic regression, decision tree classifier, bagging classifier, random forest classifier)
  6. Select the best model (using accuracy_score / roc_auc_score), then analyse its performance using a) confusion matrix, b) precision, c) recall, d) F1-score, e) ROC curve and AUC
  7. Draw conclusions from all the statistics computed for the best model
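
The steps up to model selection could be sketched roughly as follows with pandas and scikit-learn. This is a minimal illustration, not the notebook's actual code: the function names, the choice to drop rows with missing values, one-hot encoding, the 75/25 split, and the default `n=10` are all assumptions made for the example. The column names follow the dataset's published description, since the file itself has no header row.

```python
import pandas as pd
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

URL = "https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data"

# The raw file has no header row; these names follow the dataset description.
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",
]


def load_adult(source=URL):
    """Step 1: read the raw file. Fields are separated by ', ' and '?' marks a missing value."""
    return pd.read_csv(source, header=None, names=COLUMNS,
                       na_values="?", skipinitialspace=True)


def preprocess(df):
    """Step 3 (one possible choice): drop missing/duplicate rows, one-hot encode categoricals."""
    df = df.dropna().drop_duplicates()
    y = (df["income"] == ">50K").astype(int)  # 1 means income above 50K
    X = pd.get_dummies(df.drop(columns="income"), dtype=float)
    return X, y


def top_features(X, y, n=10, seed=0):
    """Steps 4-5: rank columns by random forest importance and keep the top n."""
    forest = RandomForestClassifier(n_estimators=100, random_state=seed).fit(X, y)
    ranked = pd.Series(forest.feature_importances_, index=X.columns)
    return ranked.sort_values(ascending=False).head(n).index.tolist()


def compare_models(X, y, seed=0):
    """Steps 5-6: fit each classifier and score it on a held-out split."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)
    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "bagging": BaggingClassifier(random_state=seed),
        "random_forest": RandomForestClassifier(random_state=seed),
    }
    rows = {}
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        rows[name] = {
            "accuracy": accuracy_score(y_te, model.predict(X_te)),
            # roc_auc_score needs a score for the positive class, not a hard label.
            "roc_auc": roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]),
        }
    return pd.DataFrame(rows).T  # one row per model
```

A typical call would be `X, y = preprocess(load_adult())`, then `features = top_features(X, y)` and `compare_models(X[features], y)`. Logistic regression may converge slowly on unscaled columns such as `fnlwgt`, so scaling the numeric features first is worth considering.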
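
To make the evaluation metrics in step 6 concrete, here is a small dependency-free sketch of how the confusion matrix counts lead to precision, recall, and F1-score. The helper names and the toy labels are made up for illustration; in practice the same numbers would normally come from `sklearn.metrics`.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if truth == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision = tp/(tp+fp), recall = tp/(tp+fn), F1 = their harmonic mean."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy labels, invented for this example (1 = income above 50K).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
print(confusion_counts(y_true, y_pred))     # (3, 1, 1, 3)
print(precision_recall_f1(y_true, y_pred))  # (0.75, 0.75, 0.75)
```

Because the two income classes in this dataset are imbalanced, accuracy alone can look good for a model that rarely predicts the positive class, which is why the precision, recall, and F1 figures are worth reading alongside it.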