variational-proteins: A Jupyter Notebook repository from AndreeaMusat

🔬 Variational Proteins Starter Pack 🎒

This is a starter pack for the Variational Proteins project in course 02460 - Advanced Machine Learning at DTU Compute (Spring 2021). It includes the datasets and some boring boilerplate code to help with loading and parsing (misc.py).

We also provide a very simple vanilla VAE (vae.py) and a training/eval loop (train.py) to get you started. If you need to brush up on your Variational Autoencoders, check out week 7 of 02456 - Deep Learning.

🚋 Training

To train the included toy VAE and see all components in action:

python train.py

When training completes, a file trained.model.pth is created. It contains the training progress, the model parameters and other data, ready to be explored in the notebook.ipynb Jupyter notebook.
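If you want a quick look at the saved file outside the notebook, something like the sketch below may help. It assumes the file was written with torch.save and holds a dict-like object; the actual keys and layout are not documented here, so treat this as exploratory. The helper name describe is made up for this example.

```python
from pathlib import Path


def describe(checkpoint):
    """Return {key: short description} for a dict-like checkpoint (hypothetical helper)."""
    summary = {}
    for key, value in checkpoint.items():
        if hasattr(value, "shape"):
            # Tensors and arrays expose a shape attribute
            summary[key] = f"{type(value).__name__} with shape {tuple(value.shape)}"
        elif isinstance(value, (list, tuple, dict)):
            summary[key] = f"{type(value).__name__} with {len(value)} entries"
        else:
            summary[key] = type(value).__name__
    return summary


path = Path("trained.model.pth")
if path.exists():
    import torch  # only needed when the file is actually present

    # Assumption: the file holds a dict. Recent PyTorch versions may need
    # weights_only=False to load arbitrary Python objects; only do that
    # for files you trust.
    checkpoint = torch.load(path, map_location="cpu")
    for key, info in describe(checkpoint).items():
        print(f"{key}: {info}")
```

Run it from the repository root after python train.py has finished. If the saved object turns out not to be a dict, inspect it with type() first.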

❓ What's next

When you are comfortable with the data and the problem, consider working on the following ideas:

Ideas from the paper

Group Sparsity Prior (Limit the influence of neurons to a small number of positions)
Bayesian Learning (Prevent overfitting and achieve an "ensembling" effect)
Sequence weighting (Fix overrepresentation in the dataset)
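To make the sequence weighting idea concrete, here is a minimal sketch of one common scheme: each aligned sequence gets the weight 1 / (number of sequences within a normalized Hamming distance threshold of it). The function name sequence_weights, the threshold value theta=0.2 and the toy sequences are assumptions for illustration, not necessarily the exact procedure from the paper.

```python
def sequence_weights(seqs, theta=0.2):
    """Weight each aligned sequence by the reciprocal of its neighbour count.

    A neighbour is any sequence (itself included) whose normalized Hamming
    distance is below theta. All sequences must have the same length.
    """
    length = len(seqs[0])
    weights = []
    for a in seqs:
        neighbours = 0
        for b in seqs:
            mismatches = sum(x != y for x, y in zip(a, b))
            if mismatches / length < theta:
                neighbours += 1
        weights.append(1.0 / neighbours)
    return weights


# Toy alignment: three near-identical sequences and one outlier
toy = ["ACDEFGHIKL", "ACDEFGHIKL", "ACDEFGHIKV", "WYWYWYWYWY"]
w = sequence_weights(toy)
print(w)       # the three similar sequences share weight, the outlier keeps 1.0
print(sum(w))  # effective number of sequences
```

The weights can then be used to sample training sequences or to scale each sequence's contribution to the loss. The double loop is quadratic in the number of sequences, so a real dataset would likely need a vectorized version.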

Other ideas

Different VAE architecture (e.g. Hierarchical VAE)
Compare to a GPLVM model
Bayesian Optimization in the latent space
