/Hubble.2d6-paper-implementation

Implementation of Hubble.2D6 based on the original paper - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7660895/


Implementing Hubble.2D6

Link to project repo: https://github.com/Locrian24/seng474-term-project

This repo contains the report, poster, and source code for an attempted implementation of the Hubble.2D6 tool based on the original paper. This was a term project for the SENG 474 class at the University of Victoria.

IMPORTANT: This is a naive and incomplete implementation of the Hubble.2D6 tool. It was an undergraduate project and is far from a reliable tool. The official version of Hubble.2D6 can be found at this repo.

Acknowledgements

The majority of the logic for pre-processing and post-processing the data is taken from the original tool (here). The same applies to supplementary data, such as the pre-computed embeddings, and to specifics of the deep learning networks' architecture.

How to run

Creating environment

Using Anaconda:

conda env create --file cannett_474_env.yml

Using pip with a virtual environment:

python3 -m venv env
source env/bin/activate
pip install -r requirements.txt
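
To confirm that the virtual environment is active before installing or running anything, a quick check along these lines may help. This is a minimal sketch and not part of the repository; the function name in_virtualenv is hypothetical. It applies to environments created with python3 -m venv and is not a reliable test for a conda environment.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment.

    Hypothetical helper for illustration; not part of this repository.
    """
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```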

Run

Sample file with 3 star alleles:

python3 model/hubble.py -v data/sample.vcf

Star alleles from PharmVar:

python3 model/hubble.py -v step3/data/star_samples.vcf
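
Before passing a different VCF file to the script, it can be useful to see which samples and how many variant records the file contains. The following is a minimal sketch under stated assumptions: it is not taken from model/hubble.py, the function name summarize_vcf is hypothetical, and it handles only uncompressed, tab-delimited VCF text. It relies on the VCF convention that meta-information lines start with ##, and that the #CHROM header line lists sample names after the nine fixed columns.

```python
import sys


def summarize_vcf(path):
    """Return (sample_names, record_count) for an uncompressed VCF file.

    Hypothetical helper for illustration; not part of this repository.
    """
    samples = []
    records = 0
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.rstrip("\n")
            if not line.strip():
                continue  # skip blank lines
            if line.startswith("#CHROM"):
                # Columns 1-9 are fixed (CHROM through FORMAT);
                # anything after them is a sample name.
                samples = line.split("\t")[9:]
                continue
            if line.startswith("#"):
                continue  # meta-information or other header line
            records += 1  # every remaining line is one variant record
    return samples, records


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: python3 summarize_vcf.py <file.vcf>")
    names, count = summarize_vcf(sys.argv[1])
    print(f"{len(names)} sample(s), {count} variant record(s)")
```

If saved as a standalone script (the file name here is only an example), it could be run as python3 summarize_vcf.py data/sample.vcf.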

Supplementary material

In addition to the source code for the implementation, Google Colab notebooks are included that show the training process and the generation of evaluation metrics.

Colab notebooks are found in the notebooks directory.