Machine Learning and Artificial Intelligence - Homework #2 - A.Y. 2018/2019 - Politecnico di Torino
- Load Iris dataset
- Simply select the first two dimensions (let’s skip PCA this time)
- Randomly split data into train, validation and test sets in proportion 5:2:3
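The loading and 5:2:3 split could be sketched as follows with scikit-learn (the `random_state` and variable names are my own choices, not prescribed by the assignment):

```python
# Load Iris, keep the first two features, split 50/20/30 (train/val/test).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X = X[:, :2]  # first two dimensions only, no PCA

# First cut off 50% for training, then split the rest 2:3 into val/test.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, train_size=0.5, random_state=42, stratify=y)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.6, random_state=42, stratify=y_tmp)
```

With 150 samples this yields 75 training, 30 validation, and 45 test points.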
- For C from 10^(-3) to 10^3: (multiplying at each step by 10)
- Train a linear SVM on the training set.
- Plot the data and the decision boundaries
- Evaluate the method on the validation set
- Plot a graph showing how the accuracy on the validation set varies when changing C
- How do the boundaries change? Why?
- Use the best value of C and evaluate the model on the test set. How well does it perform?
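The C sweep for the linear SVM could look like the sketch below (assuming scikit-learn's `SVC`; the split parameters repeat the earlier step and are my own choices):

```python
# Linear SVM: sweep C over 1e-3 ... 1e3, pick the best C on validation,
# then score that model once on the test set.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X = X[:, :2]
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, train_size=0.5, random_state=42, stratify=y)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.6, random_state=42, stratify=y_tmp)

C_values = [10.0 ** k for k in range(-3, 4)]  # multiply by 10 each step
val_scores = {}
for C in C_values:
    clf = SVC(kernel="linear", C=C).fit(X_train, y_train)
    val_scores[C] = clf.score(X_val, y_val)
    # Decision boundaries can be drawn by evaluating clf.predict on a
    # np.meshgrid over the two-feature plane (plotting code omitted here).

best_C = max(val_scores, key=val_scores.get)
test_acc = SVC(kernel="linear", C=best_C).fit(X_train, y_train).score(X_test, y_test)
```

Plotting `val_scores` against `C_values` (log scale on the x axis) gives the accuracy-vs-C graph requested above.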
- Repeat the C sweep above (train, plot, evaluate), but this time use an RBF kernel.
- Evaluate the best C on the test set.
- Are there any differences compared to the linear kernel? How are the boundaries different?
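The RBF version reuses the same sweep; only the kernel changes. A minimal sketch (same assumed split as before):

```python
# Same C sweep as the linear case, but with the RBF kernel
# (gamma is left at scikit-learn's default here; it is tuned later).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X = X[:, :2]
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, train_size=0.5, random_state=42, stratify=y)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.6, random_state=42, stratify=y_tmp)

C_values = [10.0 ** k for k in range(-3, 4)]
val_scores = {C: SVC(kernel="rbf", C=C).fit(X_train, y_train).score(X_val, y_val)
              for C in C_values}
best_C = max(val_scores, key=val_scores.get)
test_acc = SVC(kernel="rbf", C=best_C).fit(X_train, y_train).score(X_test, y_test)
```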
- Perform a grid search of the best parameters for an RBF kernel: we will now tune both gamma and C at the same time. Select an appropriate range for both parameters. Train the model and score it on the validation set.
- Show a table of how these parameter combinations score on the validation set.
- Evaluate the best parameters on the test set. Plot the decision boundaries.
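The joint grid search over gamma and C, scored on the held-out validation set, could be sketched as below (the parameter ranges are illustrative choices, not prescribed by the assignment):

```python
# Grid search over (gamma, C) for an RBF SVM, scored on the validation set.
# The scores matrix is the "table" of validation accuracies.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X = X[:, :2]
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, train_size=0.5, random_state=42, stratify=y)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.6, random_state=42, stratify=y_tmp)

C_range = [10.0 ** k for k in range(-3, 4)]      # example range
gamma_range = [10.0 ** k for k in range(-5, 2)]  # example range
scores = np.zeros((len(gamma_range), len(C_range)))
for i, gamma in enumerate(gamma_range):
    for j, C in enumerate(C_range):
        clf = SVC(kernel="rbf", C=C, gamma=gamma).fit(X_train, y_train)
        scores[i, j] = clf.score(X_val, y_val)

i, j = np.unravel_index(scores.argmax(), scores.shape)
best_gamma, best_C = gamma_range[i], C_range[j]
test_acc = SVC(kernel="rbf", C=best_C, gamma=best_gamma).fit(
    X_train, y_train).score(X_test, y_test)
```

The `scores` matrix (rows: gamma, columns: C) can be printed or shown as a heatmap to produce the requested table.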
- Merge the training and validation splits. You should now have 70% training and 30% test data.
- Repeat the grid search for gamma and C, but this time perform 5-fold cross-validation.
- Evaluate the parameters on the test set. Is the final score different? Why?
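The 5-fold variant can be done with scikit-learn's `GridSearchCV` over the merged 70% training split (again, the ranges and `random_state` are my own illustrative choices):

```python
# Merge train+val into a 70% training set, then grid-search (C, gamma)
# with 5-fold cross-validation and score the best model on the 30% test set.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X = X[:, :2]
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

param_grid = {"C": [10.0 ** k for k in range(-3, 4)],
              "gamma": [10.0 ** k for k in range(-5, 2)]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X_trainval, y_trainval)  # refits on all 70% with the best params

test_acc = search.score(X_test, y_test)
```

Because each parameter pair is now scored as an average over 5 folds rather than on a single fixed validation split, the selected parameters (and hence the final test score) may differ from the previous step.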