In this project I implemented a decision tree, bagged trees, a random forest, and XGBoost to compare the MAE performance of these tree-based algorithms.
- Use the data USCompaniesdata.dta. Create a training set containing half of the observations, and a test set containing the remaining observations. Fit a tree with Return on Assets (roa_w) as the response and the other variables as predictors.
- Apply bagging to USCompaniesdata.dta. Compare the MSE of the single tree from the previous step with the MSE of the bagged trees.
- Apply random forests to USCompaniesdata.dta. Do random forests provide an improvement over the bagged trees from the previous step?
- Apply boosting to USCompaniesdata.dta. Which variables are the most important predictors in the boosted model?
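
The first two steps (a 50/50 split, a single tree, then bagging, compared on test error) can be sketched in plain Python. This is only an illustrative sketch, not the project's actual code: it uses synthetic data in place of USCompaniesdata.dta, a single hypothetical predictor `x`, and depth-1 trees (stumps) so that it runs with the standard library alone. The helper names `split_half`, `fit_stump`, and `fit_bagged` are assumptions made for this example. A real analysis would more likely rely on a library such as scikit-learn or xgboost, which also cover random forests and boosting.

```python
import random
from statistics import mean


def split_half(rows, seed=0):
    """Shuffle the rows and split them into two halves (train, test)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mse(y_true, y_pred):
    """Mean squared error."""
    return mean((a - b) ** 2 for a, b in zip(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean(abs(a - b) for a, b in zip(y_true, y_pred))


def fit_stump(rows):
    """Fit a depth-1 regression tree on (x, y) pairs.

    Tries every midpoint between distinct x values and keeps the split
    with the lowest sum of squared errors.
    """
    xs = sorted({x for x, _ in rows})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in rows if x <= t]
        right = [y for x, y in rows if x > t]
        ml, mr = mean(left), mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:  # every x is identical: predict the overall mean
        m = mean(y for _, y in rows)
        return lambda x: m
    _, t, ml, mr = best
    return lambda x: ml if x <= t else mr


def fit_bagged(rows, n_trees=50, seed=0):
    """Bagging: fit one stump per bootstrap sample and average the predictions."""
    rng = random.Random(seed)
    stumps = [fit_stump([rng.choice(rows) for _ in rows]) for _ in range(n_trees)]
    return lambda x: mean(s(x) for s in stumps)


# Synthetic stand-in for the real data (assumption): y = x^2 plus noise.
data_rng = random.Random(42)
data = []
for _ in range(200):
    x = data_rng.random()
    data.append((x, x ** 2 + data_rng.gauss(0, 0.05)))

train_rows, test_rows = split_half(data)
tree = fit_stump(train_rows)
bag = fit_bagged(train_rows)

y_test = [y for _, y in test_rows]
tree_mse = mse(y_test, [tree(x) for x, _ in test_rows])
bag_mse = mse(y_test, [bag(x) for x, _ in test_rows])
print(f"single tree MSE: {tree_mse:.4f}")
print(f"bagged trees MSE: {bag_mse:.4f}")
```

Both `mse` and `mae` are included because the project summary refers to MAE while the exercise text asks for MSE; either can be passed the same predictions.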