An R package for building patient-level predictive models using data in the OMOP Common Data Model format.
- Takes a cohort and outcome of interest as input.
- Extracts the necessary data from a database in OMOP Common Data Model format.
- Uses a large set of covariates, including, for example, all drugs, diagnoses, and procedures, as well as age, comorbidity indexes, etc.
- Supports various machine learning algorithms for developing predictive models.
- Includes functions for evaluating predictive models.
- Includes functions to plot and explore model performance (ROC and calibration).
- Supported outcome models are L1-regularized logistic regression, random forest, gradient boosting machines, naive Bayes, k-nearest neighbors (KNN), and multilayer perceptron (MLP).
[Figures: calibration plot and ROC plot]
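To make the two performance views concrete, here is a minimal base-R sketch of what an ROC summary (AUC) and a calibration table measure. It does not use or reproduce the package's own evaluation functions: `aucSketch` and `calibrationSketch` are hypothetical names, and the predictions and outcomes are simulated rather than taken from any patient data.

```r
# Simulated predictions and outcomes for illustration only (not patient data).
set.seed(42)
n <- 1000
risk <- runif(n)                              # predicted probabilities
outcome <- rbinom(n, size = 1, prob = risk)   # outcomes drawn from those risks

# AUC via the rank-sum (Mann-Whitney) identity; average ranks handle ties.
aucSketch <- function(prediction, outcome) {
  ranks <- rank(prediction)
  nPos <- sum(outcome == 1)
  nNeg <- sum(outcome == 0)
  (sum(ranks[outcome == 1]) - nPos * (nPos + 1) / 2) / (nPos * nNeg)
}

# Calibration: mean predicted risk vs. observed outcome rate per quantile bin.
calibrationSketch <- function(prediction, outcome, bins = 10) {
  breaks <- unique(quantile(prediction, probs = seq(0, 1, length.out = bins + 1)))
  group <- cut(prediction, breaks = breaks, include.lowest = TRUE)
  data.frame(
    meanPredicted = as.numeric(tapply(prediction, group, mean)),
    observedRate = as.numeric(tapply(outcome, group, mean))
  )
}

auc <- aucSketch(risk, outcome)
calibration <- calibrationSketch(risk, outcome)
```

Plotting `calibration$observedRate` against `calibration$meanPredicted` and comparing the points with the diagonal gives a simple calibration plot; a well-calibrated model stays close to that line.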
PatientLevelPrediction is an R package, with some functions implemented in C++ and Python.
Requires R (version 3.3.0 or higher). Installation on Windows requires RTools. Libraries used in PatientLevelPrediction require Java and Python.
Python is required for some of the machine learning algorithms. We advise installing Python 2.7 using Anaconda (https://www.continuum.io/downloads).
- Cyclops
- DatabaseConnector
- SqlRender
- FeatureExtraction
- BigKnn
- On Windows, make sure RTools is installed.
- The DatabaseConnector and SqlRender packages require Java. Java can be downloaded from http://www.java.com.
- Random forest, naive Bayes, and MLP require Python 2.7. Python 2.7 can be downloaded from https://www.continuum.io/downloads.
- In R, use the following commands to download and install PatientLevelPrediction:
```r
install.packages("drat")
drat::addRepo("OHDSI")
install.packages("PatientLevelPrediction")
```
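Before running the installation, it can help to confirm from within R that the prerequisites listed above are visible. The following is a rough sketch using only base R; `checkPrerequisites` and `onPath` are hypothetical helpers, not part of PatientLevelPrediction. It only checks that executables are on the `PATH` and does not verify the Java or Python version. Treating `make` as a sign that RTools is present is an assumption.

```r
# Hypothetical helper: TRUE if an executable with this name is on the PATH.
onPath <- function(command) {
  unname(nzchar(Sys.which(command)))
}

# Hypothetical helper: reports which prerequisites this R session can see.
checkPrerequisites <- function() {
  isWindows <- .Platform$OS.type == "windows"
  c(
    rVersionOk = getRversion() >= "3.3.0",
    javaOnPath = onPath("java"),
    pythonOnPath = onPath("python"),
    # RTools only matters on Windows; "make" on the PATH is a rough proxy.
    rtoolsOk = !isWindows || onPath("make")
  )
}

status <- checkPrerequisites()
print(status)
```

Any entry reported as `FALSE` points to the corresponding installation step to revisit.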
Have a look at the video below for a demo of the package.
- Vignette: Building patient-level predictive models
- Developer questions/comments/feedback: OHDSI Forum
- We use the GitHub issue tracker for all bugs/issues/enhancements.
PatientLevelPrediction is licensed under Apache License 2.0.
PatientLevelPrediction is being developed in RStudio.
Beta
- This project is supported in part through the National Science Foundation grant IIS 1251151.