The goal of this project is to work on the theoretical fundamentals of machine learning through several exercises.
We had to work on the Bayes predictor and the Bayes risk associated with some particular settings. Another part is dedicated to the OLS estimator.
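For reference, these are the standard textbook forms of the quantities involved (generic definitions, not the specific settings derived in the report); the loss ℓ and the design matrix X are placeholders:

```latex
% Bayes predictor: pointwise minimizer of the conditional expected loss
f^*(x) = \arg\min_{z} \; \mathbb{E}\left[\ell(Y, z) \mid X = x\right]

% Bayes risk: the risk achieved by the Bayes predictor
R^* = \mathbb{E}\left[\ell\left(Y, f^*(X)\right)\right]

% OLS estimator: least-squares solution (when X^\top X is invertible)
\hat{\theta}_{\mathrm{OLS}} = \arg\min_{\theta} \; \lVert y - X\theta \rVert_2^2 = (X^\top X)^{-1} X^\top y
```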
In this part, we had to perform a regression on the dataset stored in data/regression/:
- The inputs x are stored in inputs.npy.
- The labels y are stored in labels.npy.
We were free to choose the regression method. The report contains the explanation and discussion of our approach, including:
- the performance of the several methods that we tried;
- the choice of the hyperparameters and the method used to select them;
- the optimization method.
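As an illustration of this kind of pipeline (not necessarily the method retained in the report), here is a minimal sketch assuming ridge regression with its regularization strength chosen by cross-validated grid search; the file paths follow the data/regression/ layout above, while the model, the grid of values, and the split are hypothetical choices:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import mean_squared_error

# Load the regression dataset (paths from the data/regression/ layout above).
X = np.load("data/regression/inputs.npy")
y = np.load("data/regression/labels.npy")

# Hold out a test split to report a final performance figure.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Hypothetical choice: ridge regression, with alpha selected by 5-fold cross-validation.
grid = GridSearchCV(
    Ridge(),
    param_grid={"alpha": [0.01, 0.1, 1.0, 10.0, 100.0]},
    cv=5,
    scoring="neg_mean_squared_error",
)
grid.fit(X_train, y_train)

# Evaluate the selected model on the held-out test split.
y_pred = grid.predict(X_test)
print("best alpha:", grid.best_params_["alpha"])
print("test MSE:", mean_squared_error(y_test, y_pred))
```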
Here are the benchmark results:
For this exercise, we had to perform classification on a given dataset, with our inputs stored in an inputs.npy file and our labels in a labels.npy file, as in the previous part.
We were again free to choose our classifier and to implement whatever we wanted, so we decided to test several algorithms exposed by the scikit-learn API and compare them.
We split our dataset into a train set and a test set, ran these algorithms with their default hyper-parameters, and evaluated them by their accuracy. The objective was to obtain an accuracy above 0.85 on the test subset. With this basic implementation, we obtained results above 0.85 for several of the tested classifiers.
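As an illustration of this workflow, here is a minimal sketch, assuming a data/classification/ directory analogous to data/regression/; the particular classifiers listed are a hypothetical selection, not necessarily the ones benchmarked below:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

# Load the classification dataset (directory name is an assumption).
X = np.load("data/classification/inputs.npy")
y = np.load("data/classification/labels.npy")

# Split into a train set and a test set.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Candidate classifiers with their default hyper-parameters (hypothetical selection).
classifiers = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(random_state=0),
    "svc": SVC(),
    "knn": KNeighborsClassifier(),
}

# Fit each model and report its accuracy on the test subset.
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    accuracy = clf.score(X_test, y_test)  # mean accuracy for classifiers
    print(f"{name}: test accuracy = {accuracy:.3f}")
```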
Here are the benchmark results:
alexandre.lemonnier
alexandre.poignant
sarah.gutierez
victor.simonin