Kernel Methods for Machine Learning

The goal of the project is to perform a classification task using kernel methods. We want to predict whether a given DNA sequence is bound or not bound by a transcription factor of interest.
Three data sets are given, corresponding to three different transcription factors. Each transcription factor defines a separate classification task, so three different predictive models have to be implemented.

Learning algorithms are implemented "from scratch", using only optimization libraries.
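
As an illustration of what "from scratch" can look like, here is a minimal sketch of kernel ridge regression used as a binary classifier with a Gaussian kernel. It is not the code from this repository: the names `gaussian_kernel` and `KernelRidge`, the parameters `lam` and `sigma`, the use of NumPy, and the toy data are all assumptions made for the example.

```python
import numpy as np


def gaussian_kernel(X, Y, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    # Pairwise squared Euclidean distances, computed without explicit loops.
    sq_dists = (
        np.sum(X ** 2, axis=1)[:, None]
        + np.sum(Y ** 2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))


class KernelRidge:
    """Hypothetical minimal KRR classifier for labels in {-1, +1}."""

    def __init__(self, lam=1e-3, sigma=1.0):
        self.lam = lam      # regularization strength (assumed name)
        self.sigma = sigma  # kernel bandwidth (assumed name)

    def fit(self, X, y):
        n = X.shape[0]
        K = gaussian_kernel(X, X, self.sigma)
        # Closed-form solution: alpha = (K + lam * n * I)^{-1} y
        self.alpha = np.linalg.solve(K + self.lam * n * np.eye(n), y)
        self.X_train = X
        return self

    def predict(self, X):
        scores = gaussian_kernel(X, self.X_train, self.sigma) @ self.alpha
        return np.sign(scores)


# Toy data: two well-separated 2-D clusters labelled -1 and +1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
y = np.concatenate([-np.ones(20), np.ones(20)])

model = KernelRidge(lam=1e-3, sigma=1.0).fit(X, y)
train_accuracy = np.mean(model.predict(X) == y)
print("training accuracy:", train_accuracy)
```

The only numerical routine needed here is a linear solve. An SVM would instead require a quadratic programming solver, which is presumably where an optimization library comes in.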

Here is the structure of the code:

"models.py" implements the learning algorithms, SVM and KRR.
"kernels.py" implements the kernels: Gaussian Kernel, Polynomial Kernel and MismatchKernel (which generalizes the Spectrum Kernel).
"tuning.py" is used for hyperparameter tuning. For this purpose, the annotated data is split into training and validation sets. Measuring performance on both sets allows us to detect underfitting and overfitting.
"tools.py" implements functions for data reading and data augmentation.
"main.py" generates the competition submission.
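
To make the string kernels concrete, the following sketch computes a spectrum kernel on DNA sequences: the kernel value is the inner product of the two sequences' k-mer count vectors. A mismatch kernel extends this by also counting k-mers that differ in up to m positions, so the spectrum kernel is the m = 0 case. The function names `kmer_counts`, `spectrum_kernel` and `gram_matrix`, the default `k=3`, and the example sequences are hypothetical and are not taken from "kernels.py".

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every contiguous substring of length k in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def spectrum_kernel(seq_a, seq_b, k=3):
    """Inner product of the k-mer count vectors of two sequences."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # A Counter returns 0 for missing keys, so unshared k-mers contribute 0.
    return sum(count * counts_b[kmer] for kmer, count in counts_a.items())


def gram_matrix(sequences, k=3):
    """Symmetric matrix of pairwise spectrum kernel values."""
    n = len(sequences)
    K = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            K[i][j] = K[j][i] = spectrum_kernel(sequences[i], sequences[j], k)
    return K


sequences = ["ACGTAC", "ACGTTT", "GGGGGG"]
K = gram_matrix(sequences, k=3)
for row in K:
    print(row)
```

The first two sequences share the 3-mers ACG and CGT, so their kernel value is 2, while neither shares any 3-mer with the third sequence. A Gram matrix like this can be passed to any kernel method in place of a kernel computed on numeric vectors.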

To reproduce the submission, the script "start.sh" can be called from Python.

lucas-morin/kernelmethods
