Decision-Trees-using-Scikit-learn

When you deposit cash at a bank, the cashier feeds the banknotes into a machine that checks whether each one is real. The file “BankNote_Authentication.csv” contains four features (variance, skew, curtosis and entropy) and a class attribute that indicates whether the banknote is real or forged.

Problem 1 [Decision Trees using Scikit-learn]: Use the Banknote Authentication data attached to the assignment to implement the following requirements:

  1. Experiment with a fixed train_test split ratio: Use 25% of the samples for training and the rest for testing.
     a. Rerun this experiment five times and note the impact of different random splits of the data into training and test sets.
     b. Report the size and accuracy of the tree in each experiment.
  2. Experiment with a range of train_test split ratios: Measure the impact of training set size on the accuracy and the size of the learned tree. Consider training set sizes from 30% to 70% (start at 30%, then 40%, and so on until you reach 70%), and for each training set size:
     a. Run the experiment with five different random seeds.
     b. Calculate the mean, maximum and minimum accuracy.
     c. Measure the mean, maximum and minimum tree size.
     d. Store your statistics in a report.
     e. Draw two plots: 1) accuracy against training set size, and 2) the number of nodes in the final tree against training set size.
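
The two experiments above can be sketched as follows. This is a minimal sketch, not a reference solution: the helper names `run_trials`, `sweep` and `plot_sweep` are hypothetical, the seeds 0 to 4 are an arbitrary choice, and tree size is assumed to mean the node count of the fitted tree. The functions take a feature matrix `X` and a label vector `y`, so loading the CSV is left to the caller.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def run_trials(X, y, train_size, seeds=range(5)):
    """Fit one tree per seed; return (accuracies, node_counts) as arrays."""
    accuracies, node_counts = [], []
    for seed in seeds:
        # The seed controls the random split, so each trial trains on different rows.
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, train_size=train_size, random_state=seed
        )
        tree = DecisionTreeClassifier(random_state=seed).fit(X_train, y_train)
        accuracies.append(tree.score(X_test, y_test))  # accuracy on the held-out rows
        node_counts.append(tree.tree_.node_count)      # assumed meaning of "tree size"
    return np.array(accuracies), np.array(node_counts)


def sweep(X, y, train_sizes=(0.3, 0.4, 0.5, 0.6, 0.7), seeds=range(5)):
    """Collect mean/max/min accuracy and node count for each training set size."""
    rows = []
    for size in train_sizes:
        acc, nodes = run_trials(X, y, size, seeds)
        rows.append({
            "train_size": size,
            "acc_mean": acc.mean(), "acc_max": acc.max(), "acc_min": acc.min(),
            "nodes_mean": nodes.mean(), "nodes_max": nodes.max(), "nodes_min": nodes.min(),
        })
    return rows


def plot_sweep(rows):
    """Draw mean accuracy and mean node count against training set size."""
    import matplotlib.pyplot as plt  # imported here so the functions above run without it

    sizes = [r["train_size"] for r in rows]
    fig, (ax_acc, ax_nodes) = plt.subplots(1, 2, figsize=(10, 4))
    ax_acc.plot(sizes, [r["acc_mean"] for r in rows], marker="o")
    ax_acc.set(xlabel="training set size (fraction)", ylabel="mean test accuracy")
    ax_nodes.plot(sizes, [r["nodes_mean"] for r in rows], marker="o")
    ax_nodes.set(xlabel="training set size (fraction)", ylabel="mean number of nodes")
    fig.tight_layout()
    return fig
```

With the data loaded (for example via `pandas.read_csv("BankNote_Authentication.csv")`, splitting the class column from the four features), `run_trials(X, y, 0.25)` covers the first experiment, and `sweep(X, y)` produces one row of statistics per training set size that can be written to a report or passed to `plot_sweep`.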