/mit-tmle-sepsis


Disparities in Use of Interventions across Races in ICU Sepsis Patients

Many interventions in healthcare are still not based on hard evidence, and care might differ between races, especially in the Intensive Care Unit (ICU).

The goal of this project is to investigate disparities between races in critically ill sepsis patients with regard to in-hospital mortality, renal replacement therapy (RRT), vasopressor use (VP), and mechanical ventilation (MV), in cohorts curated from MIMIC-IV (2008-2019).

How to run this project

1. Clone this repository

Run the following command in your terminal.

git clone https://github.com/joamats/mit-tmle.git

2. Install required packages

R scripts

Run the following command:

source('src/r_scripts/setup/install_packages.R')

Python scripts

Run the following command:

pip install -r src/py_scripts/setup/requirements.txt

3. Fetch the data

MIMIC data can be found on PhysioNet, a repository of freely available medical research data managed by the MIT Laboratory for Computational Physiology. Due to its sensitive nature, credentialing is required to access the dataset.

Documentation for MIMIC-IV can be found here.

Integration with Google Cloud Platform (GCP)

In this section, we explain how to set up GCP and your environment so that you can run SQL queries through GCP directly from your local Python environment. Follow these steps:

  1. Create a Google account if you don't have one and go to Google Cloud Platform
  2. Enable the BigQuery API
  3. Create a Service Account and download its JSON keys
  4. Place your JSON keys, for example, in the parent folder of your project
  5. Create a .env file with the command cp env.example .env
  6. Update your .env file with the path to your JSON keys and the ID of your project in BigQuery
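
Before running any query, it can help to confirm that the .env file from steps 5 and 6 is complete. The following is a minimal sketch using only the Python standard library. The variable names KEYS_FILE and PROJECT_ID are assumptions made for illustration; use whatever names appear in env.example.

```python
from pathlib import Path


def load_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_gcp_settings(values, keys_var="KEYS_FILE", project_var="PROJECT_ID"):
    """Return a list of problems; an empty list means the settings look usable.

    keys_var and project_var are hypothetical variable names, not taken
    from the repository.
    """
    problems = [
        f"{name} is missing or empty"
        for name in (keys_var, project_var)
        if not values.get(name)
    ]
    keys_path = values.get(keys_var)
    if keys_path and not Path(keys_path).is_file():
        problems.append(f"{keys_var} does not point to a file: {keys_path}")
    return problems
```

Calling check_gcp_settings(load_env_file(".env")) returns an empty list when both values are set and the key file exists.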

MIMIC-IV

After being credentialed on PhysioNet, you must sign the data use agreement and connect the database to GCP, either by requesting access or by uploading the data to your project.

Once all the tables needed by the cohort generation query are in your project, run the following command to fetch the data as a dataframe and save it as a CSV file in your local project. Make sure all required files and folders exist.

python3 src/py_scripts/get_data.py --sql "src/sql_queries/mimic_table.sql" --destination "data/MIMIC_data.csv"
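
For orientation, a script with this command-line interface could be structured roughly as below. This is an illustrative sketch, not the contents of get_data.py: the function names are hypothetical, the CSV is written with the standard library rather than a dataframe, and run_on_bigquery assumes the google-cloud-bigquery package and valid credentials are available.

```python
import argparse
import csv
from pathlib import Path


def parse_args(argv=None):
    """Parse the same two flags used in the command above."""
    parser = argparse.ArgumentParser(description="Run a SQL file and save the result as CSV.")
    parser.add_argument("--sql", required=True, help="Path to the .sql file to run")
    parser.add_argument("--destination", required=True, help="Path of the CSV file to write")
    return parser.parse_args(argv)


def run_on_bigquery(sql_text, project_id):
    """Hypothetical runner: execute the query on BigQuery and return rows as dicts."""
    from google.cloud import bigquery  # imported lazily so the rest works without it

    client = bigquery.Client(project=project_id)
    return [dict(row.items()) for row in client.query(sql_text).result()]


def fetch_to_csv(sql_path, destination, run_query):
    """Read the query text, run it with run_query, and write the rows to destination."""
    rows = run_query(Path(sql_path).read_text())
    dest = Path(destination)
    dest.parent.mkdir(parents=True, exist_ok=True)  # create data/ if it is missing
    with dest.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0]) if rows else [])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Passing the query runner as an argument keeps the file handling testable without network access. With real credentials it could be wired up as fetch_to_csv(args.sql, args.destination, lambda q: run_on_bigquery(q, "your-project-id")), where the project ID is a placeholder.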

Then transform it into a ready-to-use dataframe:

source("src/r_scripts/utils/load_data.R")

The ICD-9 to ICD-10 translation is based on this GitHub repo.

4. Run the different analyses

4.1 TMLE

Targeted Maximum Likelihood Estimation (TMLE) was used to estimate the average treatment effect for one of the interventions. Data were stratified by race and by predicted probability of mortality based on the OASIS score. Running the following commands replicates the obtained results.

source("src/r_scripts/tmle_bin.R")
# for binary outcomes

source("src/r_scripts/tmle_cont.R")
# for continuous outcomes
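
For readers new to TMLE, the textbook form of the estimator for a binary intervention and a binary outcome is sketched below. The notation is an assumption introduced here: W for covariates, A for the intervention, Y for the outcome. The scripts above may use different learners or settings, so this should be read as background rather than a description of their exact configuration.

```latex
% Step 1: initial estimates of the outcome regression and the propensity score
\bar{Q}^{0}(A, W) \approx \mathbb{E}[Y \mid A, W], \qquad
g(W) \approx \mathbb{P}(A = 1 \mid W)

% Step 2: clever covariate
H(A, W) = \frac{A}{g(W)} - \frac{1 - A}{1 - g(W)}

% Step 3: targeting step; \hat{\epsilon} is fitted by logistic regression of Y on H
% with \operatorname{logit} \bar{Q}^{0}(A, W) as a fixed offset
\operatorname{logit} \bar{Q}^{*}(A, W)
  = \operatorname{logit} \bar{Q}^{0}(A, W) + \hat{\epsilon}\, H(A, W)

% Step 4: plug-in estimate of the average treatment effect
\hat{\psi}_{\mathrm{ATE}}
  = \frac{1}{n} \sum_{i=1}^{n}
    \left[ \bar{Q}^{*}(1, W_i) - \bar{Q}^{*}(0, W_i) \right]
```

Under the stratification described above, this estimate would be computed separately within each stratum.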

How to contribute

We are actively working on this project. Feel free to raise questions by opening an issue, or to fork this project and submit pull requests!