CheXzero

Expert-level detection of pathologies from unannotated chest X-ray images via self-supervised learning

Expert-level detection of pathologies from unannotated chest X-ray images via self-supervised learning, Nat. Biomed. Eng (2022). [Paper]
Ekin Tiu, Ellie Talius, Pujan Patel, Curtis P. Langlotz, Andrew Y. Ng, Pranav Rajpurkar
Tiu, E., Talius, E., Patel, P. et al. Expert-level detection of pathologies from unannotated chest X-ray images via self-supervised learning. Nat. Biomed. Eng (2022). https://doi.org/10.1038/s41551-022-00936-9


This repository contains code to train a self-supervised learning model on chest X-ray images that lack explicit annotations and evaluate this model's performance on pathology-classification tasks.

Main Findings
  1. Automatically detecting pathologies in chest X-rays without explicit annotations: Our method learns directly from paired images and unstructured radiology reports, thereby avoiding time-consuming labeling efforts. The model can predict multiple pathologies and differential diagnoses that it had not explicitly seen during training.
  2. Matching radiologist performance on different tasks on an external test set: Our method performed on par with human radiologists when evaluated on an external validation set (CheXpert) of chest X-ray images labeled for the presence of 14 different conditions by multiple radiologists.
  3. Outperforming approaches that train on explicitly labeled data on an external test set: Using no labels, we outperformed a fully supervised approach (trained on 100% of labels) on 3 of the 8 selected pathologies on a dataset (PadChest) collected in a different country. We further demonstrated high performance (AUC > 0.9) on 14 findings and an AUC of at least 0.700 on 53 of the 107 radiographic findings the method had not seen during training.

Dependencies

To clone all files:

git clone https://github.com/rajpurkarlab/CheXzero.git

To install Python dependencies:

pip install -r requirements.txt

Data

Training Dataset

  1. Navigate to MIMIC-CXR Database to download the training dataset. Note: in order to gain access to the data, you must be a credentialed user as defined on PhysioNet.
  2. Copy the dataset into the data/ directory.
  3. Run python preprocess_train_data.py
  4. This preprocesses the chest X-ray images into a Hierarchical Data Format (HDF5) file used for training, stored at data/cxr.h5, and extracts the impressions section of each corresponding radiology report as text, stored at data/mimic_impressions.csv. A short sanity-check sketch follows this list.
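
As a quick sanity check, the preprocessed outputs can be inspected with h5py and pandas. This is a minimal sketch; the "cxr" dataset key is an assumption about the HDF5 layout, not something the script guarantees.

import h5py
import pandas as pd

# inspect the preprocessed image archive
with h5py.File("data/cxr.h5", "r") as f:
    print(list(f.keys()))   # top-level dataset keys; "cxr" below is an assumption
    print(f["cxr"].shape)   # expected shape: (num_images, height, width)

# inspect the extracted impressions text
impressions = pd.read_csv("data/mimic_impressions.csv")
print(impressions.head())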

Evaluation Dataset

CheXpert Dataset

The CheXpert dataset consists of chest radiographic examinations from Stanford Hospital, performed between October 2002 and July 2017 in both inpatient and outpatient centers. Population-level characteristics are unavailable for the CheXpert test dataset, as it is used for official evaluation on the CheXpert leaderboard.

The main data (CheXpert data) supporting the results of this study are available at https://aimi.stanford.edu/chexpert-chest-x-rays.

The CheXpert test dataset used for official evaluation is hidden from the public to maintain the integrity of the CheXpert competition.

PadChest Dataset

The PadChest dataset contains chest X-rays that were interpreted by 18 radiologists at the Hospital Universitario de San Juan, Alicante, Spain, from January 2009 to December 2017. The dataset contains 109,931 image studies and 168,861 images. PadChest also contains 206,222 study reports.

The PadChest dataset is publicly available at https://bimcv.cipf.es/bimcv-projects/padchest. Those who would like to use PadChest for experimentation should request access at that link.

Model Checkpoints

Model checkpoints of CheXzero pre-trained on MIMIC-CXR are publicly available at the following link. Download files and save them in the ./checkpoints/chexzero_weights directory.

Running Training

Run the following command to perform CheXzero pretraining.

python run_train.py --cxr_filepath "./data/cxr.h5" --txt_filepath "data/mimic_impressions.csv"

Arguments

  • --cxr_filepath Path to the .h5 file containing the chest X-ray image data.
  • --txt_filepath Path to the .csv file containing the radiology report impressions text.

Use -h flag to see all optional arguments.

Zero-Shot Inference

See the following notebook for an example of how to use CheXzero to perform zero-shot inference on a chest x-ray dataset. The example shows how to output predictions from the model ensemble and evaluate performance of the model if ground truth labels are available.

import zero_shot

# computes predictions for a set of images stored as a np array of probabilities for each pathology
predictions, y_pred_avg = zero_shot.ensemble_models(
    model_paths=model_paths, 
    cxr_filepath=cxr_filepath, 
    cxr_labels=cxr_labels, 
    cxr_pair_template=cxr_pair_template, 
    cache_dir=cache_dir,
)

Arguments

  • model_paths: List[str]: List of paths to all checkpoints to be used in the ensemble. To run on a single model, input a list containing a single path.
  • cxr_filepath: str: Path to images .h5 file
  • cxr_labels: List[str]: List of pathologies to query in each image
  • cxr_pair_template: Tuple[str, str]: Pair of contrasting templates used to query the model (see Figure 1 in the article for a visual explanation).
  • cache_dir: str: Directory to cache predictions of each checkpoint, use to avoid recomputing predictions.
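
As a concrete illustration, the arguments above might be set up as follows. The checkpoint glob matches the directory described under Model Checkpoints; the test-set path and label list are placeholders for your own data, and the "{}"/"no {}" pair mirrors the contrasting-template idea from the paper.

import glob

# all downloaded checkpoints participate in the ensemble
model_paths = sorted(glob.glob("./checkpoints/chexzero_weights/*.pt"))

cxr_filepath = "./data/chexpert_test.h5"   # hypothetical path to the test images
cxr_labels = ["Atelectasis", "Cardiomegaly", "Consolidation", "Edema",
              "Pleural Effusion"]          # illustrative subset of pathologies

# contrasting positive/negative prompts, filled in with each label
cxr_pair_template = ("{}", "no {}")

cache_dir = "./cache"   # predictions of each checkpoint are cached here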

In order to use CheXzero for zero-shot inference, ensure the following requirements are met:

  • All input images must be stored in a single .h5 (Hierarchical Data Format) file. See the img_to_h5 function in preprocess_padchest.py for an example of how to convert a list of paths to .png files into a valid .h5 file; a minimal sketch of such a conversion follows this list.
  • The ground truth labels must be in a .csv dataframe where each row represents an image sample and each column holds the binary label for a particular pathology on that sample.
  • Ensure all model checkpoints are stored in checkpoints/chexzero_weights/, or in the model_dir specified in the notebook.
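
For reference, a conversion along those lines might look like the following minimal sketch. It is written in the spirit of img_to_h5 rather than copied from it; the dataset key, image size, and grayscale conversion are assumptions, not the repository's exact implementation.

import h5py
import numpy as np
from PIL import Image

def pngs_to_h5(png_paths, out_path, size=320):
    """Pack a list of .png chest X-rays into one HDF5 archive (illustrative)."""
    with h5py.File(out_path, "w") as f:
        dset = f.create_dataset("cxr", shape=(len(png_paths), size, size), dtype="float32")
        for i, path in enumerate(png_paths):
            # convert to grayscale and resize to a fixed square resolution
            img = Image.open(path).convert("L").resize((size, size))
            dset[i] = np.asarray(img, dtype="float32")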

Evaluation

Given a numpy array of predictions (obtained from zero-shot inference) and a numpy array of ground truth labels, one can evaluate the model's performance using the following code:

from typing import Tuple

import pandas as pd

import zero_shot
import eval

# load the ground truth labels into memory
test_pred = y_pred_avg  # ensemble predictions from zero-shot inference above
test_true = zero_shot.make_true_labels(cxr_true_labels_path=cxr_true_labels_path, cxr_labels=cxr_labels)

# evaluate the model on the full test dataset, no bootstrap
cxr_results: pd.DataFrame = eval.evaluate(test_pred, test_true, cxr_labels)

# bootstrap evaluations for 95% confidence intervals
bootstrap_results: Tuple[pd.DataFrame, pd.DataFrame] = eval.bootstrap(test_pred, test_true, cxr_labels)  # (df of results for each bootstrap sample, df of CIs)

# print results with confidence intervals
print(bootstrap_results[1])

The results are represented as a pd.DataFrame which can be saved as a .csv.
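
For example, the confidence-interval dataframe can be written straight to disk (the output filename here is illustrative):

bootstrap_results[1].to_csv("chexzero_bootstrap_ci.csv")  # hypothetical output filename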

Issues

Please open new issue threads specifying the issue with the codebase or report issues directly to ekintiu@stanford.edu.

Citation

Tiu, E., Talius, E., Patel, P. et al. Expert-level detection of pathologies from unannotated chest X-ray images via self-supervised learning. Nat. Biomed. Eng (2022). https://doi.org/10.1038/s41551-022-00936-9

License

The source code for this repository is licensed under the MIT license, which you can find in the LICENSE file. Also see NOTICE.md for attributions to third-party sources.