Implementing_Different_Trees_Models_on_US_Companies_Data

In this project I implemented decision tree, bagged tree, random forest, and XGBoost models to compare the MAE performance of these tree-based algorithms.

Primary language: Jupyter Notebook

We try to answer these questions:

  1. Use the data USCompaniesdata.dta. Create a training set containing half of the observations, and a test set containing the remaining observations. Fit a tree with Return on Assets (roa_w) as the response and the other variables as predictors.
  2. Apply bagging to USCompaniesdata.dta. Compare the MSE of the tree in question 1 with the MSE of the bagged trees.
  3. Apply random forests to USCompaniesdata.dta. Do random forests provide an improvement over the bagged trees in question 2?
  4. Apply boosting to USCompaniesdata.dta. Which variables are the most important predictors in the boosted model?
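
The four questions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the notebook's actual code: it uses synthetic data as a stand-in for USCompaniesdata.dta (the real file could presumably be loaded with `pandas.read_stata`), a made-up response in place of `roa_w`, and scikit-learn's `GradientBoostingRegressor` in place of XGBoost so that the example only depends on NumPy and scikit-learn. The hyperparameters are arbitrary choices.

```python
import numpy as np
from sklearn.ensemble import (BaggingRegressor, GradientBoostingRegressor,
                              RandomForestRegressor)
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the real data set (assumption, not the project's data).
rng = np.random.default_rng(0)
n_obs, n_features = 400, 6
X = rng.normal(size=(n_obs, n_features))
# Hypothetical response playing the role of roa_w: driven by features 0 and 1 only.
y = 2.0 * X[:, 0] - X[:, 1] ** 2 + rng.normal(scale=0.5, size=n_obs)

# Question 1: half of the observations for training, the other half for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

models = {
    # Question 1: a single regression tree.
    "tree": DecisionTreeRegressor(random_state=0),
    # Question 2: bagging, i.e. trees fit on bootstrap samples and averaged.
    "bagging": BaggingRegressor(
        DecisionTreeRegressor(), n_estimators=100, random_state=0
    ),
    # Question 3: random forest, which also restricts each split to a
    # random subset of the features (2 of 6 here).
    "random_forest": RandomForestRegressor(
        n_estimators=100, max_features=2, random_state=0
    ),
    # Question 4: boosting, with scikit-learn's implementation swapped in
    # for XGBoost.
    "boosting": GradientBoostingRegressor(n_estimators=100, random_state=0),
}

# Fit every model on the training half and score it on the test half.
test_mse = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_mse[name] = mean_squared_error(y_test, model.predict(X_test))
    print(f"{name}: test MSE = {test_mse[name]:.3f}")

# Question 4: rank predictors by their importance in the boosted model.
importances = models["boosting"].feature_importances_
ranking = np.argsort(importances)[::-1]
print("features ordered by importance:", ranking.tolist())
```

On the real data, `X` would hold the predictor columns and `y` the `roa_w` column. To report MAE instead of MSE, `mean_absolute_error` from `sklearn.metrics` can be used in the same place.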