/clinical-rule-vetting

Learning clinical-decision rules with interpretable models.

Primary language: Jupyter Notebook | License: MIT

⚕️ Interpretable Clinical Decision Rules ⚕️️

Validating and deriving clinical-decision rules. Work-in-progress.

This is a collaborative repository intended to validate and derive clinical-decision rules. We use a unified pipeline across a variety of contributed datasets to vet previous modeling practices for clinical decision rules. Additionally, we hope to externally validate the rules under study here with data from UCSF.

Rule derivation datasets

Dataset | Task | Size | References | Processed
iai_pecarn | Predict intra-abdominal injury requiring acute intervention before CT | 12,044 patients, 203 with IAI-I | 📄, 🔗 |
tbi_pecarn | Predict traumatic brain injuries before CT | 42,412 patients, 376 with ciTBI | 📄, 🔗 |
csi_pecarn | Predict cervical spine injury in children | 3,314 patients, 540 with CSI | 📄, 🔗 |
tig_pecarn | Predict bacterial/non-bacterial infections in febrile infants from RNA transcriptional biosignatures | 279 patients, ? with infection | 🔗 |
exxagerate | Predict 30-day mortality for acute exacerbations of chronic obstructive pulmonary disease (AECOPD) | 1,696 patients, 17 mortalities | 📄, 🔗 |
heart_disease_uci | Predict heart disease presence from basic attributes / screening | 920 patients, 509 with heart disease | 📄, 🔗 |

Research paper 📄, Data download link 🔗

Datasets are all tabular (or at least have interpretable input features), reasonably large (e.g. have at least 100 positive and negative cases), and have a binary outcome. For PECARN datasets, please read and agree to the research data use agreement on the PECARN website.
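As a rough illustration, the size and outcome criteria above can be checked before opening an issue for a new dataset. The sketch below is not part of this repository: the function name `check_outcome_counts` is hypothetical, and the `min_per_class` default of 100 is simply taken from the "e.g." above rather than being a hard requirement.

```python
from collections import Counter


def check_outcome_counts(outcomes, min_per_class=100):
    """Return (ok, counts) for a sequence of outcome labels.

    ok is True only if there are exactly two distinct labels and each
    label occurs at least `min_per_class` times (an assumed threshold).
    """
    counts = Counter(outcomes)
    is_binary = len(counts) == 2
    is_large_enough = all(n >= min_per_class for n in counts.values())
    return is_binary and is_large_enough, dict(counts)


# Toy labels (hypothetical, not drawn from any dataset listed above)
toy_outcomes = [1] * 120 + [0] * 900
ok, counts = check_outcome_counts(toy_outcomes)
print(ok, counts)
```

A check like this only covers the outcome column; whether the input features are interpretable still needs a manual look.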

Possible data sources: PECARN datasets | Kaggle datasets | MDCalc | UCI | OpenML | MIMIC | UCSF De-ID
Potential specific datasets: we may later expand to other high-stakes datasets (e.g. COMPAS, loan risk).

Contributing checklist

To contribute a new project (e.g. a new dataset + modeling), create a pull request following the steps below. The easiest way to do this is to copy-paste an existing project (e.g. iai_pecarn) into a new folder and then edit that one.

Helpful docs: Collaboration details | Lab writeup | Slides

  • Repo set up
    • Create a fork of this repo (see tutorial on forking/merging here)
    • Install the repo as shown below
    • Select a dataset - once you've selected, open an issue in this repo with the name of the dataset + a brief description so others don't work on the same dataset
    • Assign a project_name to the new project (e.g. iai_pecarn)
  • Data preprocessing
    • Download the raw data into data/{project_name}/raw
      • Don't commit any very large files
    • Copy the template files from rulevetting/projects/iai_pecarn to a new folder rulevetting/projects/{project_name}
      • Rewrite the functions in dataset.py for processing the new dataset (e.g. see the dataset for iai_pecarn)
      • Document any judgement calls you aren't sure about using the dataset.get_judgement_calls_dictionary function
      • Notebooks / helper functions are optional; all files should be within rulevetting/projects/{project_name}
  • Data description
    • Describe each feature in the processed data in a file named data_dictionary.md
    • Summarize the data and the prediction task in a file named readme.md. This should include basic details of data collection (who, how, when, where), why the task is important, and how a clinical decision rule may be used in this context. It should also include your names/affiliations.
  • Modeling
    • Baseline model - implement baseline.py for predicting given a baseline rule (e.g. from the existing paper)
    • New model - implement model_best.py for making predictions using your newly derived best model
  • Lab writeup (see instructions)
    • Save writeup into writeup.pdf + include source files
    • Should contain details on exploratory analysis, modeling, validation, comparisons with baseline, etc.
  • Submitting
    • Ensure that all tests pass by running pytest --project {project_name} from the repo directory
    • Open a pull request and it will be reviewed / merged
  • Reviewing submissions
    • Each pull request will be reviewed by others before being merged
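
Before running the tests, it can help to confirm that the files named in the checklist above are present. The following is a minimal sketch, not a script shipped with this repo: the helper `missing_project_files` is hypothetical, and it assumes every listed file sits directly inside rulevetting/projects/{project_name}, which the checklist does not state explicitly for every file.

```python
from pathlib import Path

# Files named in the checklist above. Assumption: they all live directly
# inside rulevetting/projects/{project_name}.
EXPECTED_FILES = [
    "dataset.py",
    "data_dictionary.md",
    "readme.md",
    "baseline.py",
    "model_best.py",
    "writeup.pdf",
]


def missing_project_files(repo_root, project_name):
    """List the checklist files that are absent from the project folder."""
    project_dir = Path(repo_root) / "rulevetting" / "projects" / project_name
    return [name for name in EXPECTED_FILES if not (project_dir / name).is_file()]


if __name__ == "__main__":
    # Run from the repo directory; "iai_pecarn" is the example project above.
    missing = missing_project_files(".", "iai_pecarn")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All checklist files found.")
```

This only checks that the files exist; pytest --project {project_name} remains the actual gate for a submission.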

Installation

Note: requires Python 3.7 and pytest (for running the automated tests). It is best practice to create a venv or pipenv for this project.

python -m venv rule-env
source rule-env/bin/activate
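
To confirm that the environment is active before installing, a short check along these lines may help. It is only a sketch: the helper names are hypothetical, and it reads the "python 3.7" requirement above as "3.7 or newer", which is an assumption.

```python
import sys


def in_virtualenv():
    """True if the interpreter is running inside a venv (prefix differs from base)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def python_version_ok(minimum=(3, 7)):
    """True if the running interpreter is at least `minimum` (assumed reading of the note above)."""
    return sys.version_info[:2] >= minimum


print("inside venv:", in_virtualenv())
print("python >= 3.7:", python_version_ok())
```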

Then, clone the repo and install the package and its dependencies.

git clone https://github.com/Yu-Group/rule-vetting
cd rule-vetting
pip install -e .

Now run the automated tests to ensure everything works (warnings are fine as long as all tests pass).

pytest --project iai_pecarn

To use the package with Jupyter, you may need to add this venv as a Jupyter kernel.

python -m ipykernel install --user --name=rule-env

Clinical Trial Datasets

Dataset | Task | Size | References | Processed
bronch_pecarn | Effectiveness of oral dexamethasone for acute bronchiolitis | 600 patients, 50% control | 📄, 🔗 |
gastro_pecarn | Impact of Emergency Department Probiotic Treatment of Pediatric Gastroenteritis | 886 patients, 50% control | 📄, 🔗 |

Research paper 📄, Data download link 🔗

Reference

Background reading
Related packages
Updates
Related open-source collaborations