/amlp_book_code

Practice code for the book Approaching (Almost) Any ML Problem

Primary Language: Jupyter Notebook

Approaching (Almost) Any ML Problem: practice code

Practice code from the book above.

  1. Supervised and unsupervised learning
  2. Cross validation
  3. Evaluation metrics
  4. Project structure for any ML project
  5. Approaching categorical variables
    1. OneHot encoding + Logistic Regression model
      This gives an AUC score of ~0.78, which is good: AUC ranges from 0 to 1, where 1 indicates a perfect model and 0.5 corresponds to random guessing.
    2. LabelEncoding
      1. Random Forest model
        • This gives an AUC score of ~0.71, which is worse than the Logistic Regression model.
        • This model also takes more time and memory than the Logistic Regression model.
        • This suggests that we should never overlook a simple baseline model when approaching a problem.
      2. XGBoost model
        • This gives an AUC score of ~0.76, which is better than the Random Forest model but still below the Logistic Regression model.
        • This model also takes more time and memory than both the Logistic Regression and Random Forest models.
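
The two encodings and the AUC metric compared above can be illustrated with a small, library-free sketch. This is not the code from the notebooks: the helper names (`label_encode`, `one_hot_encode`, `auc_score`) and the toy data are assumptions made up for illustration, and a real workflow would typically rely on a library such as scikit-learn instead.

```python
# Illustrative sketch only: toy data and hypothetical helper names,
# not the implementation used in the notebooks.

def label_encode(values):
    """Map each distinct category to an integer, using sorted category order."""
    mapping = {cat: idx for idx, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """Map each value to a 0/1 vector with one position per category."""
    categories = sorted(set(values))
    vectors = [[1 if v == cat else 0 for cat in categories] for v in values]
    return vectors, categories


def auc_score(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win,
    so the result always lies between 0 and 1.
    """
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


colors = ["red", "green", "blue", "green"]
labels, mapping = label_encode(colors)
vectors, categories = one_hot_encode(colors)
print(labels)   # [2, 1, 0, 1]
print(vectors)  # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]

y_true = [0, 0, 1, 1]
print(auc_score(y_true, [0.1, 0.4, 0.35, 0.8]))  # 0.75
print(auc_score(y_true, [0.1, 0.2, 0.8, 0.9]))   # 1.0 (perfect ranking)
print(auc_score(y_true, [0.5, 0.5, 0.5, 0.5]))   # 0.5 (no better than chance)
```

Label encoding produces a single integer column, which suits tree-based models such as Random Forest and XGBoost, while one-hot encoding produces one binary column per category, which suits linear models such as Logistic Regression.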