This repo is for paper Leveraging graph-based hierarchical medical entity embedding for healthcare applications published in Scientific Reports
Tong Wu1, Yunlong Wang1*, Yue Wang1, Emily Zhao1, Yilian Yuan1
1 Advanced Analytics, IQVIA Inc., Plymouth Meeting, Pennsylvania, USA
The eICU Collaborative Research Database
Once obtaining the permission, download patient.csv
, admissionDx.csv
, diagnosis.csv
, treatment.csv
, and carePlanCareProvider.csv
to the folder saved_data
(if the folder does not exist, create it).
tqdm==4.51.0
gensim==3.8.3
matplotlib==3.3.3
networkx==2.5
numpy==1.19.5
pandas==1.2.0
scikit_learn==0.24.0
stellargraph==1.2.1
torch==1.7.1
The implementation of node2vec is from aditya-grover/node2vec. It should be downloaded to the repo folder with the name node2vec
.
To generate service embedding, run
python experiments/prepare_svc_embedding.py
python node2vec/src/main.py --input graph/ppd_eICU.edgelist --output emb/ppd_eICU.emd --dimensions 128 --walk-length 100 --num-walks 10 --window-size 20 --iter 150 --workers 8 --p 4 --q 1
To generate doctor embedding, run
python experiments/prepare_doc_embedding.py
python experiments/doc_emb_train.py
To generate patient embedding, run
python experiments/prepare_pat_embedding.py
python experiments/pat_emb_train.py
Run the follwing script first:
python experiments/prepare_baseline_embedding.py
node2vec:
python node2vec/src/main.py --input graph/baseline_node2vec.edgelist --output emb/baseline_node2vec_emb.emd --dimensions 128 --walk-length 100 --num-walks 10 --window-size 20 --iter 150 --workers 8 --p 4 --q 1
LINE:
python experiments/line_emb_train.py
metapath2vec:
python experiments/metapath2vec_emb_train.py --input saved_data/baseline/graph_metapath.pkl --output saved_data/baseline/baseline_emb_metapath2vec.emd
For nonnegative matrix factorization (NMF) and spectral clustering (SC), their embeddings are generated in the experiment code.
python experiments/readmission_prediction.py
python experiments/sequential_learning_finetune.py