An End-to-End Graph Neural Network for Disease Gene Prioritization.
We provide a Dockerfile to set up a runtime environment. To build the image, run
docker build -t gnetdgp .
To create and activate the conda environment from environment.yml, run
conda env create -f environment.yml
conda activate gnetdgp_env
To get an overview of available commands use
python main.py --help
To list the available options on a specific command use
python main.py [COMMAND] --help
To train a new generic model use
python main.py generic-train --training_data_path ./data/training/genes_diseases.tsv
For available options run
python main.py generic-train --help
To score gene, disease pairs with the generic model, provide an input file of gene, disease tuples, as in test/example_input_generic.tsv.
Then run
python main.py generic-predict test/example_input_generic.tsv
This scores the provided gene, disease tuples and returns an augmented version of the input file with the scores added.
The result is stored in --out_file; the default is ./generic_predict_results.tsv
The result is sorted by the predicted score by default.
To preserve the input order, add the option --sort_result_by_score False
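For downstream filtering, the result file can be read back with Python's standard csv module. The following is a minimal sketch, not part of gNetDGP: the function name top_predictions is hypothetical, and it assumes the result file is tab-separated with a header row. The name of the score column is not stated here, so the caller has to pass it after inspecting the file.

```python
import csv


def top_predictions(result_path, score_column, k=10):
    """Return the k rows with the highest value in score_column.

    Assumptions (not confirmed by the gNetDGP documentation):
    - the file is tab-separated and starts with a header row
    - score_column is the header name of the column holding the scores
    """
    with open(result_path, newline="") as handle:
        rows = list(csv.DictReader(handle, delimiter="\t"))
    # Sort explicitly so the function also works on files written with
    # --sort_result_by_score False.
    rows.sort(key=lambda row: float(row[score_column]), reverse=True)
    return rows[:k]
```

Sorting inside the function keeps it independent of how the result file was written.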
To use a specific pre-trained model for the prediction, add the option --model_path /path/to/model.ptm.
To get a list of available genes in the model run
python main.py generic-predict --get_available_genes
To get a list of available diseases in the model run
python main.py generic-predict --get_available_diseases
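When predictions are launched from a script, the generic-predict options described above can be assembled into an argument list. This is a sketch under stated assumptions: build_generic_predict_cmd is a hypothetical helper, not part of the repository, and it assumes the command is run from the directory that contains main.py. It only uses the flags named in this README.

```python
import sys


def build_generic_predict_cmd(input_path, out_file=None, model_path=None,
                              keep_input_order=False):
    """Build the argument list for `python main.py generic-predict`.

    Hypothetical helper; only the flags documented in the README are used.
    """
    cmd = [sys.executable, "main.py", "generic-predict", input_path]
    if out_file is not None:
        cmd += ["--out_file", out_file]
    if model_path is not None:
        cmd += ["--model_path", model_path]
    if keep_input_order:
        # The README documents this option with an explicit False value.
        cmd += ["--sort_result_by_score", "False"]
    return cmd
```

Passing the returned list to subprocess.run(cmd, check=True) would start the prediction. Building a list rather than a shell string avoids quoting problems with paths that contain spaces.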
To train the specific model run
python main.py specific-train --training_disease_genes_path ./data/training/genes_diseases.tsv --training_disease_class_assignments_path ./data/training/extracted_disease_class_assignments.tsv
For available options run
python main.py specific-train --help
To predict disease scores with the specific model, provide an input file with Entrez gene IDs (one per line), as in test/example_input_specific.tsv. Use the --model_path flag to provide a pretrained specific model.
python main.py specific-predict ./test/example_input_specific.tsv --model_path /path/to/chosen/pretrained/specific/model.ptm
This scores how strongly each provided gene ID is associated with the disease the model was trained on and returns an augmented version of the input file with the scores added.
The result is stored in --out_file; the default is ./generic_predict_results.tsv
The result is sorted by the predicted score by default.
To preserve the input order, add the option --sort_result_by_score False
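The input file for specific-predict described above (one Entrez gene ID per line) can be generated from a list of IDs. The sketch below is illustrative only: write_gene_ids is a hypothetical helper, and it assumes the file has no header line, which should be checked against test/example_input_specific.tsv.

```python
def write_gene_ids(gene_ids, path):
    """Write Entrez gene IDs to path, one per line, and return the count written.

    Assumption: the input file has no header line.
    Entrez gene IDs are integers, so anything non-numeric is rejected.
    Duplicates are skipped and the first-seen order is kept.
    """
    seen = set()
    with open(path, "w", newline="\n") as handle:
        for gene_id in gene_ids:
            gene_id = str(gene_id).strip()
            if not gene_id.isdigit():
                raise ValueError(f"not an Entrez gene ID: {gene_id!r}")
            if gene_id in seen:
                continue
            seen.add(gene_id)
            handle.write(gene_id + "\n")
    return len(seen)
```

Rejecting non-numeric values early catches the common mistake of passing gene symbols where Entrez IDs are expected.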
The process of the reported experiments is documented in