The aim of this project is to build an insurance cost prediction model using ensemble learners. Ensemble methods combine the predictions of several base estimators built with a given learning algorithm in order to improve generalizability and robustness over a single estimator. In this project, we use three methods:
- Random Forest
- AdaBoost
- Gradient Boosting Tree
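
As a rough illustration of how the three methods listed above could be compared, here is a minimal sketch using scikit-learn regressors. The choice of scikit-learn, the helper names `build_models` and `fit_and_score`, the hyperparameters, and the synthetic data are all assumptions made for this example and are not taken from the project itself.

```python
import numpy as np
from sklearn.ensemble import (
    AdaBoostRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
)
from sklearn.model_selection import train_test_split


def build_models(random_state=0):
    """Return one regressor per ensemble method (illustrative settings only)."""
    return {
        "random_forest": RandomForestRegressor(
            n_estimators=100, random_state=random_state
        ),
        "adaboost": AdaBoostRegressor(
            n_estimators=100, random_state=random_state
        ),
        "gradient_boosting": GradientBoostingRegressor(
            n_estimators=100, random_state=random_state
        ),
    }


def fit_and_score(X, y, random_state=0):
    """Fit each model on a training split and return its R^2 on a held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=random_state
    )
    scores = {}
    for name, model in build_models(random_state).items():
        model.fit(X_train, y_train)
        scores[name] = float(model.score(X_test, y_test))
    return scores


# Synthetic stand-in data (NOT the insurance dataset), used only to run the sketch.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(300, 4))
y_demo = 3.0 * X_demo[:, 0] + X_demo[:, 1] ** 2 + rng.normal(scale=0.1, size=300)

demo_scores = fit_and_score(X_demo, y_demo)
for name, r2 in demo_scores.items():
    print(f"{name}: R^2 = {r2:.3f}")
```

Scoring every model on the same held-out split keeps the comparison fair; in practice, cross-validation would likely give a more stable estimate than a single split.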
Variables included in the dataset:
- charges: individual medical costs billed by health insurance (the prediction target)
- age: age of the primary beneficiary
- sex: gender of the insurance contractor (female, male)
- bmi: body mass index
- children: number of children covered by health insurance
- smoker: smoking status of the beneficiary
- region: the beneficiary's residential area in the US (northeast, southeast, southwest, northwest)
Dataset source: https://github.com/stedy/Machine-Learning-with-R-datasets
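
The variables described above mix numeric and categorical columns, so the categorical ones need encoding before tree ensembles in scikit-learn can use them. The sketch below shows one possible way to do this with pandas one-hot encoding. The helper name `split_features_target`, the `yes`/`no` coding of `smoker`, and the toy rows are assumptions for illustration; the toy rows are made up and are not records from the dataset.

```python
import pandas as pd

CATEGORICAL = ["sex", "smoker", "region"]  # columns to one-hot encode
TARGET = "charges"                         # column to predict


def split_features_target(df):
    """One-hot encode the categorical columns and separate the target."""
    X = pd.get_dummies(
        df.drop(columns=[TARGET]), columns=CATEGORICAL, drop_first=True
    )
    y = df[TARGET]
    return X, y


# Made-up rows with the same column names as the dataset (illustration only).
# With the real data, one would instead read the CSV from the source above,
# e.g. with pd.read_csv on a local copy of the file.
toy = pd.DataFrame(
    {
        "age": [25, 40, 58, 33],
        "sex": ["female", "male", "female", "male"],
        "bmi": [22.5, 30.1, 27.8, 35.0],
        "children": [0, 2, 1, 3],
        "smoker": ["no", "yes", "no", "yes"],  # assumed yes/no coding
        "region": ["northeast", "southeast", "southwest", "northwest"],
        "charges": [3000.0, 25000.0, 12000.0, 40000.0],
    }
)

X_toy, y_toy = split_features_target(toy)
print(X_toy.columns.tolist())
```

Using `drop_first=True` drops one level per categorical column, which avoids redundant indicator columns; tree-based models do not strictly require this, so keeping all levels would also be a reasonable choice.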