Predicts chronic diseases using a patient's previous history.
Check out the website: http://naresh1318.pythonanywhere.com/
- Get access to the MIMIC-III database.
- Download the CSV files to a local directory.
- Install PostgreSQL and follow these steps to set up the database.
- Install the following dependencies. Note: the code was written for Python 3 and is not guaranteed to work on Python 2.
  - tensorflow
  - tflearn
  - argparse
  - numpy
  - sklearn
  - gensim
  - scipy
  - nltk
  - pandas
  - matplotlib
  - seaborn
  - psycopg2
  - Flask
Install virtualenv:
pip install virtualenv
pip install virtualenvwrapper
export WORKON_HOME=~/Envs
source /usr/local/bin/virtualenvwrapper.sh
Create a virtual environment and install the dependencies:
mkvirtualenv --python=/usr/bin/python3.5 tf
workon tf
pip3 install -r requirements.txt
Note:
- All the above steps must be executed from the DiagnosisPredictor directory.
- Install TensorFlow version r.10 by following this guide.
- Installing the GPU version of TensorFlow is recommended; otherwise, training all the models can take days.
- Install tflearn after installing TensorFlow:
pip3 install tflearn==0.2.1
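After installation, it can help to confirm that the environment actually sees every package. The following is a minimal sketch, not part of the repository: the module names are assumed from the dependency list (scikit-learn is imported as `sklearn`, Flask as `flask`), and `find_missing` is a hypothetical helper.

```python
import importlib.util

# Import names assumed from the dependency list; adjust if the
# repository's requirements.txt differs.
REQUIRED_MODULES = [
    "tensorflow", "tflearn", "argparse", "numpy", "sklearn", "gensim",
    "scipy", "nltk", "pandas", "matplotlib", "seaborn", "psycopg2", "flask",
]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed dependencies can be located.")
```

Run it inside the activated virtual environment. The check only locates the modules and does not verify their versions.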
cd Data_Preparation
psql -U mimic -a -f allevents.sql
python3 generate_icd_levels.py
python3 generate_seq_combined.py
python3 generate_vector_tfidf.py
Note: If permission is denied, try sudo psql -U mimic -a -f allevents.sql
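For readers unfamiliar with TF-IDF, the idea behind the last script's name can be sketched in plain Python. This is an illustration under stated assumptions, not the code in generate_vector_tfidf.py: it uses raw term counts and idf = log(N / df), the function `tfidf_vectors` is hypothetical, and the event codes are made up.

```python
import math
from collections import Counter


def tfidf_vectors(sequences):
    """Return (vocabulary, vectors) for a list of token sequences.

    Uses raw term frequency and idf = log(N / df); other weightings exist.
    """
    vocabulary = sorted({token for seq in sequences for token in seq})
    n_docs = len(sequences)
    # Document frequency: number of sequences containing each token.
    doc_freq = Counter(token for seq in sequences for token in set(seq))
    vectors = []
    for seq in sequences:
        counts = Counter(seq)
        vectors.append([
            counts[token] * math.log(n_docs / doc_freq[token])
            for token in vocabulary
        ])
    return vocabulary, vectors


# Hypothetical patient event sequences, not taken from MIMIC-III.
patients = [
    ["d_401", "d_250", "d_401"],
    ["d_250", "d_428"],
]
vocab, vecs = tfidf_vectors(patients)
```

Codes that appear in every sequence get a weight of zero, while codes specific to a few patients are weighted up. The actual features come from the repository's own scripts, run with the commands listed earlier.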
The data preparation scripts generate the CSV file Data/mimic_diagnosis_tfidf/diagnosis_tfidf_5645_pat.csv, which contains 1471 columns.
The first 1391 columns contain the TF-IDF representation of each patient sequence. The last 80 columns contain the 80 chronic diagnoses we are trying to predict.
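The column layout described here can be turned into a feature/label split as follows. This is a minimal sketch using only the standard library; `split_row` and `load_dataset` are hypothetical helpers, and whether the real file has a header row or an index column is an assumption to verify before use.

```python
import csv
import io

N_LABELS = 80  # the last 80 columns are the chronic diagnoses


def split_row(row, n_labels=N_LABELS):
    """Split one CSV row into (features, labels) as lists of floats."""
    values = [float(v) for v in row]
    return values[:-n_labels], values[-n_labels:]


def load_dataset(handle, n_labels=N_LABELS, has_header=False):
    """Read all rows from an open CSV file handle.

    has_header=False is an assumption about the file, not a known fact.
    """
    reader = csv.reader(handle)
    if has_header:
        next(reader)
    features, labels = [], []
    for row in reader:
        x, y = split_row(row, n_labels)
        features.append(x)
        labels.append(y)
    return features, labels


# Tiny in-memory stand-in: 3 feature columns followed by 2 label columns.
demo = io.StringIO("0.1,0.0,0.7,1,0\n0.0,0.5,0.2,0,1\n")
X, y = load_dataset(demo, n_labels=2)
```

With the real file, pass a handle from `open("Data/mimic_diagnosis_tfidf/diagnosis_tfidf_5645_pat.csv", newline="")` and keep the default of 80 labels; each feature row should then have 1391 entries.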
Running the decision tree predictor:
cd ../Predictor_Tfidf
python3 decision_tree.py
The results are stored at Results_tfidf/Random_Forest.
Figure: ROC curve for the random forest predictor.
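For reference, an ROC curve for a single diagnosis label can be computed from true labels and predicted scores as sketched below. This is a plain-Python illustration, not the repository's plotting code; in practice `sklearn.metrics.roc_curve` would be the usual choice. The helper names and the example predictions are hypothetical.

```python
def roc_points(labels, scores):
    """Return ROC points (fpr, tpr) by sweeping the threshold over the scores.

    labels are 0/1 ground-truth values; scores are predicted probabilities.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("ROC needs at least one positive and one negative")
    # Sort by descending score so lowering the threshold admits items in order.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last item of a tied-score group.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical predictions for a single diagnosis label.
y_true = [1, 0, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.1]
curve = roc_points(y_true, y_score)
```

Since the predictors output 80 labels, one such curve would be computed per label, or the labels would be averaged in some way; which of these the repository does is not stated here.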
Similarly, other algorithms such as a fully connected network (multilayer perceptron) can be run as follows:
python3 dense_fully_connected.py
The results are stored at Results_tfidf/Dense_fully_connected.
Figure: ROC curve for the dense fully connected predictor.
The website loads the saved models from the dense fully connected predictor and makes predictions:
cd Project_Website
python3 app.py