Transcription factors (TFs) are regulatory proteins that bind specific sequence motifs in the genome to activate or repress transcription of target genes. Genome-wide protein-DNA binding maps can be profiled experimentally, so every genomic sequence can be assigned to one of two classes for a TF of interest: bound or unbound. In this challenge, we work with three datasets corresponding to three different TFs.
The repo is organized as follows:
├── data
├── demo
├── docs
│ ├── svg
│ └── tex
├── src
│ ├── classifiers
│ ├── decomposition
│ ├── evaluation
│ └── kernels
└── utils
- `data`: contains the provided datasets (`Xtr[012].csv`, `Ytr[012].csv` and `Xte[012].csv`) as well as precomputed kernels
- `demo`: demonstration notebooks
- `docs`: images used in the repository (`/svg`) and the project report (`/tex`)
- `src`:
  - `classifiers`: kernel-based classifiers such as Kernel Logistic Regression or Kernel SVM
  - `decomposition`: kernel-based matrix decomposition algorithms (so far only Kernel PCA)
  - `evaluation`: evaluation metrics and model selection scripts
  - `kernels`: various kernel implementations for DNA sequence comparison, see wiki
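
To give a feel for what a kernel for DNA sequence comparison can look like, here is a minimal sketch of a k-spectrum kernel, which compares two sequences through their counts of length-k substrings. This is an illustration only: the function names (`kmer_counts`, `spectrum_kernel`, `gram_matrix`) and the toy sequences are assumptions made for this example, not the code in `src/kernels`, and the kernels actually used in this repo are the ones described in the wiki.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every contiguous substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(x, y, k=3):
    """k-spectrum kernel: inner product of the two k-mer count vectors."""
    cx, cy = kmer_counts(x, k), kmer_counts(y, k)
    # Iterate over the smaller table; a k-mer missing from either
    # sequence contributes 0 to the sum.
    if len(cx) > len(cy):
        cx, cy = cy, cx
    return sum(count * cy[kmer] for kmer, count in cx.items())


def gram_matrix(seqs, k=3):
    """Symmetric Gram matrix K[i][j] = spectrum_kernel(seqs[i], seqs[j])."""
    n = len(seqs)
    K = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            K[i][j] = K[j][i] = spectrum_kernel(seqs[i], seqs[j], k)
    return K


# Toy sequences, made up for illustration (not taken from the datasets).
toy = ["ACGTACGT", "ACGTTTTT", "GGGGCCCC"]
K = gram_matrix(toy, k=3)
```

A Gram matrix of this kind is the input that kernel-based classifiers and Kernel PCA operate on, which is why precomputing and storing kernels can save time when several models are compared.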
The submission file is stored under `submission.csv` and can be reproduced by running `python train.py`.
- Team name: Kernelito

| | Score (acc) | Ranking |
|---|---|---|
| Public | 0.72066 | 10 |
| Private | 0.69733 | 15 |