:microscope: Variational Proteins Starter Pack :school_satchel:
This is a starter pack for the Variational Proteins project in course 02460 - Advanced Machine Learning at DTU Compute (Spring 2021). It includes the datasets and some boring boilerplate code to help with loading and parsing (misc.py).
We also provide a very simple vanilla VAE (vae.py) and a training/eval loop (train.py) to get you started. If you need to brush up on your Variational Autoencoders, check out week 7 of 02456 - Deep Learning.
🚋 Training
To train the included toy VAE and see all components in action:
python train.py
On training completion a file trained.model.pth will be created - it will include training progress, model parameters and other stuff ready to be explored by the notebook.ipynb Jupyter notebook.
❓ What's next
When you are comfortable with the data and the problem, consider working on the following ideas:
Ideas from the paper
- Group Sparsity Prior (Limit the influence of neurons to a small number of positions)
- Bayesian Learning (Prevent overfitting and achieve an "ensembling" effect)
- Sequence weighting (Fix overrepresentation in the dataset)
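To make the sequence weighting idea more concrete, here is a minimal sketch of one common reweighting scheme: each aligned sequence gets the weight 1 / (number of sequences within a normalized Hamming distance threshold, itself included), so clusters of near-duplicates together count roughly as one sequence. This is an assumption about what a reasonable implementation could look like, not necessarily the exact scheme used in the paper. The function name sequence_weights, the threshold theta=0.2 and the toy alignment are all hypothetical and chosen only for illustration.

```python
import numpy as np


def sequence_weights(seqs, theta=0.2):
    """Inverse-neighbor-count weights for equal-length aligned sequences.

    theta is an illustrative threshold on the fraction of mismatched
    positions; tune it for your own alignment.
    """
    # Encode the strings as an (N, L) array of single characters.
    arr = np.array([list(s) for s in seqs])
    # Pairwise fraction of mismatched positions, shape (N, N).
    dist = (arr[:, None, :] != arr[None, :, :]).mean(axis=-1)
    # Every sequence is its own neighbor (distance 0), so counts are >= 1.
    neighbors = (dist < theta).sum(axis=1)
    return 1.0 / neighbors


# Toy alignment: three near-identical sequences and one distant one.
toy_msa = ["ACDEFGHIKL", "ACDEFGHIKL", "ACDEFGHIKV", "WYTSRQPNMM"]
weights = sequence_weights(toy_msa)
print(weights)        # per-sequence weights
print(weights.sum())  # effective number of sequences
```

The weights could then be used to rescale the per-sequence loss, or as sampling probabilities when drawing minibatches (for example with torch.utils.data.WeightedRandomSampler). Note that this sketch builds the full N x N x L comparison in memory, so for a large alignment you would likely need to compute the distances in chunks.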
Other ideas
- Different VAE architecture (e.g. Hierarchical VAE)
- Compare to a GPLVM model
- Bayesian Optimization in the latent space.