First, train a decision tree classifier. Second, use bagging: generate a set of bootstrap datasets, fit one estimator on each bootstrap dataset, and combine them with majority voting (soft or hard) to obtain the final decision. Third, use a GradientBoosting classifier to classify the test-set samples. GradientBoosting has two important hyperparameters, the number of estimators and the learning rate; tune each of them, then report the accuracy and confusion matrix separately for the best value of each parameter.
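The first two steps could be sketched as follows with scikit-learn. `load_digits` is only a stand-in here so the snippet is self-contained; swap in the pendigits files for the actual assignment.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)  # stand-in for the pendigits splits
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Step 1: a single decision tree as the baseline.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
tree_acc = accuracy_score(y_test, tree.predict(X_test))

# Step 2: bagging draws bootstrap samples and fits one tree per sample
# (a decision tree is BaggingClassifier's default base estimator).
# For prediction it soft-votes by averaging the trees' class probabilities;
# hard majority voting would take the most common predicted label instead.
bag = BaggingClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)
bag_acc = accuracy_score(y_test, bag.predict(X_test))

print(f"tree: {tree_acc:.3f}  bagging: {bag_acc:.3f}")
```

On this stand-in data the bagged ensemble should noticeably beat the single tree, which is the point of the comparison.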
Use the Pen-Based Recognition of Handwritten Digits dataset: http://archive.ics.uci.edu/ml/datasets/penbased+recognition+of+handwritten+digits
Data Set Information: the digit database was created by collecting 250 samples from each of 44 writers. The samples written by 30 writers are used for training, cross-validation, and writer-dependent testing; the digits written by the other 14 are used for writer-independent testing.
Attribute information:
- All input attributes are integers in the range 0..100.
- The last attribute is the class code 0..9
- DT
- Bagging
- GradientBoosting
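The GradientBoosting tuning from the third step could be sketched like this. `load_digits` again stands in for the pendigits files so the snippet runs on its own, and the small parameter grids and split sizes are illustrative choices, not prescribed values.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)  # stand-in for the pendigits splits
# Small train/test split keeps the sketch quick; use the real splits in practice.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=600, test_size=300, random_state=0)

def fit_score(**params):
    clf = GradientBoostingClassifier(random_state=0, **params).fit(X_train, y_train)
    return clf, accuracy_score(y_test, clf.predict(X_test))

# Tune the number of estimators with the learning rate held fixed...
n_accs = {n: fit_score(n_estimators=n, learning_rate=0.1)[1] for n in (25, 50)}
best_n = max(n_accs, key=n_accs.get)

# ...then the learning rate with n_estimators fixed at its best value.
lr_accs = {lr: fit_score(n_estimators=best_n, learning_rate=lr)[1]
           for lr in (0.05, 0.1, 0.2)}
best_lr = max(lr_accs, key=lr_accs.get)

# Refit with the best value of each parameter, then report the results.
best_clf, best_acc = fit_score(n_estimators=best_n, learning_rate=best_lr)
pred = best_clf.predict(X_test)
print(f"best n_estimators={best_n}, learning_rate={best_lr}, accuracy={best_acc:.3f}")
print(confusion_matrix(y_test, pred))
```

Tuning the parameters one at a time like this is cheaper than a full grid search; `GridSearchCV` over both parameters jointly would be the more thorough alternative.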