Clinically-Interpretable Radiomics [MICCAI'22, CMPB'21]


Clinically-Interpretable Radiomics

MICCAI'22 Paper | CMPB'21 Paper | CIRDataset | Annotation Pipeline | Installation | Usage | Docker | Issues

This library serves as a one-stop solution for analyzing datasets using clinically-interpretable radiomics (CIR) in cancer imaging. The primary motivation comes from our collaborators in radiology and radiation oncology, who asked about the importance of clinically-reported features in state-of-the-art deep learning malignancy/recurrence/treatment response prediction algorithms. Previous methods have performed such prediction tasks but without robust attribution to any clinically reported/actionable features (see the extensive literature on the sensitivity of attribution methods to hyperparameters). This motivated us to curate datasets by annotating clinically-reported features at voxel/vertex-level on public datasets (using our CMPB'21 advanced mathematical algorithms) and relating these to prediction tasks (bypassing the “flaky” attribution schemes). With the release of these comprehensively-annotated datasets, we hope that previous malignancy prediction methods can also validate their explanations and provide clinically-actionable insights. We also provide strong end-to-end baselines for extracting these hard-to-compute clinically-reported features and using them in different prediction tasks.

CIRDataset: A large-scale Dataset for Clinically-Interpretable lung nodule Radiomics and malignancy prediction [MICCAI'22]

Spiculations/lobulations, sharp/curved spikes on the surface of lung nodules, are good predictors of lung cancer malignancy and hence, are routinely assessed and reported by radiologists as part of the standardized Lung-RADS clinical scoring criteria. Given the 3D geometry of the nodule and 2D slice-by-slice assessment by radiologists, manual spiculation/lobulation annotation is a tedious task and thus no public datasets exist to date for probing the importance of these clinically-reported features in the SOTA malignancy prediction algorithms. As part of this paper, we release a large-scale Clinically-Interpretable Radiomics Dataset, CIRDataset, containing 956 radiologist QA/QC'ed spiculation/lobulation annotations on segmented lung nodules from two public datasets, LIDC-IDRI (N=883) and LUNGx (N=73). We also present an end-to-end deep learning model based on multi-class Voxel2Mesh extension to segment nodules (while preserving spikes), classify spikes (sharp/spiculation and curved/lobulation), and perform malignancy prediction. Previous methods have performed malignancy prediction for LIDC and LUNGx datasets but without robust attribution to any clinically reported/actionable features (due to known hyperparameter sensitivity issues with general attribution schemes). With the release of this comprehensively-annotated dataset and end-to-end deep learning baseline, we hope that malignancy prediction methods can validate their explanations, benchmark against our baseline, and provide clinically-actionable insights. Dataset, code, pretrained models, and docker containers to reproduce the pipeline as well as the results in the manuscript are available in this repository.

Dataset

The first CIR dataset, released here, contains 956 spiculation/lobulation annotations (computed using our published LungCancerScreeningRadiomics library [CMPB'21] and QA/QC'ed by a radiologist) on segmented lung nodules from two public datasets: LIDC (with visual radiologist malignancy RM scores for the entire cohort and pathology-proven malignancy PM labels for a subset) and LUNGx (with pathology-proven size-matched benign/malignant nodules to remove the effect of size on malignancy prediction).

Spiculation Quantification Demo

Spikes (spiculation/lobulation) quantification via angle-preserving spherical parameterization CMPB'21. The spikes on the surface of a nodule are detected/segmented using the negative area distortion metric (in other words, spikes on the surface will collapse in area on the spherically-mapped surface whereas non-spike parts will expand in area). This leads to a computationally simple/robust algorithm with zero hyperparameters.
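As a rough illustration of the area distortion idea described above, the sketch below computes a per-vertex log area ratio between an original triangle mesh and a mapped copy with the same connectivity. It is not the LungCancerScreeningRadiomics implementation: the function names, the rescaling of both surfaces to unit total area, and the use of the one-ring (incident triangle) area are assumptions made for this example, and the spherical parameterization itself is not shown. A regular tetrahedron stands in for the spherical map of a toy shape with one elongated spike.

```python
import numpy as np


def triangle_areas(vertices, faces):
    """Area of each triangle; vertices is (V, 3) floats, faces is (F, 3) ints."""
    a, b, c = (vertices[faces[:, i]] for i in range(3))
    return 0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1)


def vertex_area_distortion(original, mapped, faces):
    """Log ratio of mapped to original one-ring area at every vertex.

    Both surfaces are rescaled to unit total area first (an assumption of this
    sketch), so a negative value means the neighborhood shrank under the map.
    """
    faces = np.asarray(faces)
    ring_area = []
    for verts in (original, mapped):
        verts = np.asarray(verts, dtype=float)
        areas = triangle_areas(verts, faces)
        areas = areas / areas.sum()
        per_vertex = np.zeros(len(verts))
        # Each triangle contributes its area to all three of its corners.
        np.add.at(per_vertex, faces.ravel(), np.repeat(areas, 3))
        ring_area.append(per_vertex)
    return np.log(ring_area[1] / ring_area[0])


# Toy shape: a small base triangle with vertex 3 pulled far out as a "spike".
spiky = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0.3, 0.3, 5)]
# Stand-in for the spherical map: a regular tetrahedron, same connectivity.
regular = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
faces = [(0, 1, 2), (0, 1, 3), (1, 2, 3), (0, 2, 3)]

distortion = vertex_area_distortion(spiky, regular, faces)
spike_vertices = np.flatnonzero(distortion < 0)  # only the spike apex is negative
```

In this toy case only the apex of the spike receives a negative value, which matches the description that spikes collapse in area while the rest of the surface expands.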

Figure: Clinically-interpretable spiculation/lobulation annotation dataset samples. First column: input CT image. Second column: overlaid semi-automated, QA/QC'ed contours and superimposed area distortion maps (for quantifying/classifying spikes, computed from spherical parameterization; see our LungCancerScreeningRadiomics library). Third column: 3D mesh model with vertex classifications (red: spiculations, blue: lobulations, white: nodule base).

End-to-End Deep Learning Nodule Segmentation, Spikes' Classification (Spiculation/Lobulation), and Malignancy Prediction Model

We also release our multi-class Voxel2Mesh extension to provide a strong benchmark for end-to-end deep learning lung nodule segmentation, spikes’ classification (lobulation/spiculation), and malignancy prediction. To our knowledge, Voxel2Mesh is the only published method that preserves sharp spikes during segmentation, which is why we use it as our base model. With the release of this comprehensively-annotated dataset, we hope that previous malignancy prediction methods can also validate their explanations/attributions and provide clinically-actionable insights. Users can also generate spiculation/lobulation annotations from scratch for LIDC/LUNGx as well as new datasets using our LungCancerScreeningRadiomics library [CMPB'21].

Figure: Depiction of the end-to-end deep learning architecture based on our multi-class Voxel2Mesh extension. The standard UNet-based voxel encoder/decoder (top) extracts features from the input CT volumes, while the mesh decoder deforms an initial spherical mesh into increasingly finer-resolution meshes matching the target shape. The mesh deformation uses feature vectors sampled from the voxel decoder through the Learned Neighborhood (LN) Sampling technique and also performs adaptive unpooling with increased vertex counts in high-curvature areas. We extend the architecture by introducing extra mesh decoder layers for spiculation and lobulation classification. We also sample vertices (shape features) from the final mesh unpooling layer as input to a fully connected malignancy prediction network. We optionally add deep voxel features from the last voxel encoder layer to the malignancy prediction network.

Installation

It is highly recommended to install dependencies in either a Python virtual environment or an Anaconda environment. Instructions for a Python virtual environment:

python3 -m venv venv
source venv/bin/activate
(venv) pip install torch==1.11.0 torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
(venv) pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py38_cu113_pyt1110/download.html
(venv) pip install wandb scikit-learn scikit-image ipython ninja pandas opencv-python tqdm

Please refer to this link for details on installing pytorch3d.

Usage

git clone --recursive git@github.com:nadeemlab/CIR.git

High-level usage instructions are given below. Detailed instructions for each step, including running pre-trained models, are described in the following subsections.

Step 1: Update config.py. You may need to set the path to the dataset and the directory where results are saved. Ready-to-train/test data is available here.

Step 2: Pre-process the data first: python data_preprocess.py

Step 3: Train the network: python main.py

Step 4: Test the trained model: python test.py
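For unattended runs, Steps 2 to 4 can be chained from a small driver script. The sketch below is only a convenience wrapper and is not part of the repository: the script names are taken from the steps above, while `PIPELINE`, `pipeline_commands`, and `run_pipeline` are hypothetical names introduced here. It assumes config.py has already been edited (Step 1) and that the scripts are run from the repository root.

```python
import subprocess
import sys

# Script names come from Steps 2-4 above; everything else is illustrative.
PIPELINE = ["data_preprocess.py", "main.py", "test.py"]


def pipeline_commands(python=sys.executable):
    """Return one argv list per step, in the order the steps must run."""
    return [[python, script] for script in PIPELINE]


def run_pipeline(repo_dir="."):
    """Run each step from the repository root; stop at the first failure."""
    for argv in pipeline_commands():
        # check=True raises CalledProcessError if a step exits with non-zero status.
        subprocess.run(argv, cwd=repo_dir, check=True)
```

Calling `run_pipeline()` from the top-level repository directory would run pre-processing, training, and testing in sequence, aborting as soon as one step fails.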

Data Pre-processing

Pre-processed data will be saved in the dataset directory.

Step 2.0: Generate nrrd files using LungCancerScreeningRadiomics

  • Lung nodule spiculation data can be generated from scratch using LungCancerScreeningRadiomics [CMPB'21] for the LIDC-IDRI and LUNGx datasets.

  • Pre-processed data is available here.

   tar xjvf CIRDataset_LCSR.tar.bz2

Step 2.1: Convert isotropic voxel data from LungCancerScreeningRadiomics to a 64x64x64 cubic image patch for 3D CNN models (dataset/NoduleDataset.py)

  • Input: Each case consists of four nrrd files (SimpleITK)
    LIDC-IDRI-0001_CT_1-all.nrrd - CT Image
    LIDC-IDRI-0001_CT_1-all-ard.nrrd - Area Distortion Map
    LIDC-IDRI-0001_CT_1-all-label.nrrd - Nodule Segmentation
    LIDC-IDRI-0001_CT_1-all-spikes-label.nrrd - Spike Classification - Spiculation:1, Lobulation: 2, Attachment: 3

  • Output: Each case consists of four npy files (numpy) - 64x64x64 cubic image patch
    LIDC-IDRI-0001_iso0.70_s_0_CT.npy - CT Image
    LIDC-IDRI-0001_iso0.70_s_0_ard.npy - Area Distortion Map
    LIDC-IDRI-0001_iso0.70_s_0_nodule.npy - Nodule Segmentation
    LIDC-IDRI-0001_iso0.70_s_0_spikes.npy - Spike Classification - Spiculation:1, Lobulation: 2, Attachment: 3

  • Pre-processed data is available here.

   tar xjvf CIRDataset_npy_for_cnn.tar.bz2
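To give a feel for what extracting a 64x64x64 patch involves, here is a minimal NumPy sketch. It is not the code in dataset/NoduleDataset.py: centering the cube on the centroid of the nodule segmentation, padding out-of-volume regions with a constant, and the names `crop_cube` and `nodule_center` are all assumptions made for illustration. Reading the nrrd files (for example with SimpleITK) and resampling are left out.

```python
import numpy as np

PATCH = 64  # edge length of the cubic patch, from Step 2.1


def crop_cube(volume, center, size=PATCH, fill=0):
    """Cut a size^3 cube centered on `center`, padding with `fill` at the borders."""
    volume = np.asarray(volume)
    out = np.full((size,) * 3, fill, dtype=volume.dtype)
    src, dst = [], []
    for c, n in zip(center, volume.shape):
        start = int(round(c)) - size // 2
        lo, hi = max(start, 0), min(start + size, n)
        src.append(slice(lo, hi))                   # region read from the volume
        dst.append(slice(lo - start, hi - start))   # where it lands in the patch
    out[tuple(dst)] = volume[tuple(src)]
    return out


def nodule_center(mask):
    """Centroid (in voxel indices) of the non-zero voxels of a segmentation."""
    return np.argwhere(np.asarray(mask) > 0).mean(axis=0)


# Synthetic stand-in for a nodule segmentation close to the volume border.
mask = np.zeros((40, 80, 100), dtype=np.uint8)
mask[10:15, 60:65, 90:95] = 1
patch = crop_cube(mask, nodule_center(mask))  # shape (64, 64, 64)
```

For the CT image a different `fill` value (for instance an air-like intensity) may be more appropriate than zero; the same center would be reused for the CT, area distortion, nodule, and spike volumes so that the four patches stay aligned.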

Step 2.2: Divide datasets into subsets (Training, Validation, Testing), extract surface voxels, and combine voxel data and outcome data (dataset/lidc.py & dataset/lungx.py)

  • Input: Output from the previous step and outcome data
    LIDC.csv - Radiological malignancy (RM) only
    LIDC72.csv - RM and pathological malignancy (PM)
    LUNGx.csv - PM only

  • Output: pickle files for each subset
    pre_computed_data_trainig_64_64_64.pickle
    pre_computed_data_validation_64_64_64.pickle (LUNGx does not have this)
    pre_computed_data_testing_64_64_64.pickle

  • Pre-processed data is available here.

   tar xjvf CIRDataset_pickle_for_voxel2mesh.tar.bz2
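Two of the operations named in Step 2.2, splitting cases into subsets and extracting surface voxels, can be sketched as follows. This is an illustration under stated assumptions rather than the logic in dataset/lidc.py or dataset/lungx.py: the split fractions and seed are placeholders (not the split used in the paper), "surface" is taken to mean foreground voxels with at least one 6-connected background neighbor, and both function names are hypothetical.

```python
import random

import numpy as np


def split_cases(case_ids, fractions=(0.7, 0.1, 0.2), seed=0):
    """Shuffle case IDs reproducibly and cut them into three disjoint subsets."""
    ids = sorted(set(case_ids))
    random.Random(seed).shuffle(ids)
    n_train = int(fractions[0] * len(ids))
    n_val = int(fractions[1] * len(ids))
    return {
        "training": ids[:n_train],
        "validation": ids[n_train:n_train + n_val],
        "testing": ids[n_train + n_val:],
    }


def surface_voxels(mask):
    """Indices of foreground voxels with at least one 6-connected background neighbor."""
    fg = np.pad(np.asarray(mask) > 0, 1)  # pad with background on every side
    interior = fg.copy()
    for axis in range(3):
        # A voxel stays "interior" only if both neighbors along this axis are foreground.
        interior &= np.roll(fg, 1, axis) & np.roll(fg, -1, axis)
    return np.argwhere(fg & ~interior) - 1  # undo the padding offset


# A solid 5x5x5 cube has 125 voxels, 27 of which are strictly inside.
cube = np.zeros((7, 7, 7), dtype=np.uint8)
cube[1:6, 1:6, 1:6] = 1
shell = surface_voxels(cube)  # 98 surface voxels
```

Splitting on case IDs rather than on individual patches keeps all patches of one case in the same subset, which avoids leakage between training and testing.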

Running Pre-trained Models

  1. Mesh Only model is available here
    tar xjvf pretrained_model-meshonly.tar.bz2
    python test.py --model_path experiments/MICCAI2022/Experiment_001/trial_1
  2. Mesh+Encoder model is available here
    tar xjvf pretrained_model-mesh+encoder.tar.bz2
    python test.py --model_path experiments/MICCAI2022/Experiment_002/trial_1

Docker

We provide a Dockerfile that can be used to run the models inside a container. First, install the Docker Engine. To use GPUs, you also need to install the NVIDIA Container Toolkit. After installing Docker, follow these steps:

  1. Clone this repository.
  2. Create a docker image from the Dockerfile; from the top-level repository directory:
cd docker; ./build.sh
  • Note: You may need to modify lines 1, 9 and 10 of the Dockerfile to match your system's CUDA version.
  3. Upon successful docker image creation, start a container:
  • A pre-built docker image including data and pre-trained models is available here
docker run --gpus all -it choilab/cir /bin/bash
  4. Then run python3 test.py --model_path experiments/MICCAI2022/Experiment_001/trial_1 or python3 test.py --model_path experiments/MICCAI2022/Experiment_002/trial_1 to test either of the two pre-trained models.

Reproducibility [MICCAI'22]

The following tables show the expected results of running the pre-trained 'Mesh Only' and 'Mesh+Encoder' models (as reported in the paper).

Table 1. Nodule (Class0), spiculation (Class1), and lobulation (Class2) peak classification metrics

Training
Network        Chamfer Weighted Symmetric ↓    Jaccard Index ↑
               Class0   Class1   Class2        Class0   Class1   Class2
Mesh Only      0.009    0.010    0.013         0.507    0.493    0.430
Mesh+Encoder   0.008    0.009    0.011         0.488    0.456    0.410

Validation
Network        Chamfer Weighted Symmetric ↓    Jaccard Index ↑
               Class0   Class1   Class2        Class0   Class1   Class2
Mesh Only      0.010    0.011    0.014         0.526    0.502    0.451
Mesh+Encoder   0.014    0.015    0.018         0.488    0.472    0.433

Testing LIDC-PM N=72
Network        Chamfer Weighted Symmetric ↓    Jaccard Index ↑
               Class0   Class1   Class2        Class0   Class1   Class2
Mesh Only      0.011    0.011    0.014         0.561    0.553    0.510
Mesh+Encoder   0.009    0.010    0.012         0.558    0.541    0.507

Testing LUNGx N=73
Network        Chamfer Weighted Symmetric ↓    Jaccard Index ↑
               Class0   Class1   Class2        Class0   Class1   Class2
Mesh Only      0.029    0.028    0.030         0.502    0.537    0.545
Mesh+Encoder   0.017    0.017    0.019         0.506    0.523    0.525
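For readers unfamiliar with the two metric families in Table 1, the sketch below shows a plain Jaccard index on binary masks and a plain symmetric Chamfer distance on point sets. These are generic textbook forms, not the evaluation code of this repository: the weighting used for "Chamfer Weighted Symmetric" is not defined in this README, so the unweighted squared-distance variant here is an assumption and its values are not directly comparable to the table.

```python
import numpy as np


def jaccard_index(pred, target):
    """Intersection over union of two binary masks (1.0 when both are empty)."""
    pred, target = np.asarray(pred) > 0, np.asarray(target) > 0
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)


def symmetric_chamfer(points_a, points_b):
    """Mean squared nearest-neighbor distance from A to B plus from B to A."""
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    # Pairwise squared distances, shape (len(a), len(b)).
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())


overlap = jaccard_index([1, 1, 0, 0], [0, 1, 1, 0])  # 1 shared voxel out of 3
gap = symmetric_chamfer([[0, 0, 0]], [[1, 0, 0]])    # 1.0 in each direction
```

Lower is better for the Chamfer distance and higher is better for the Jaccard index, as the arrows in the table headers indicate.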
 

Table 2. Malignancy prediction metrics.

Training
Network        AUC     Accuracy (%)   Sensitivity (%)   Specificity (%)   F1 (%)
Mesh Only      0.885   80.25          54.84             93.04             65.03
Mesh+Encoder   0.899   80.71          55.76             93.27             65.94

Validation
Network        AUC     Accuracy (%)   Sensitivity (%)   Specificity (%)   F1 (%)
Mesh Only      0.881   80.37          53.06             92.11             61.90
Mesh+Encoder   0.808   75.46          42.86             89.47             51.22

Testing LIDC-PM N=72
Network        AUC     Accuracy (%)   Sensitivity (%)   Specificity (%)   F1 (%)
Mesh Only      0.790   70.83          56.10             90.32             68.66
Mesh+Encoder   0.813   79.17          70.73             90.32             79.45

Testing LUNGx N=73
Network        AUC     Accuracy (%)   Sensitivity (%)   Specificity (%)   F1 (%)
Mesh Only      0.733   68.49          80.56             56.76             71.60
Mesh+Encoder   0.743   65.75          86.11             45.95             71.26
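The threshold-based columns of Table 2 follow from a binary confusion matrix in the usual way. The sketch below uses the standard definitions with malignant as the positive class (an assumption; the README does not state it). The counts in the example are hypothetical: they were chosen only because they are consistent with the 'Mesh Only' row for LIDC-PM (N=72), and the actual confusion matrix is not given here. AUC is omitted because it needs the per-case scores rather than counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and F1 (all in percent) from counts."""
    return {
        "accuracy": 100 * (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": 100 * tp / (tp + fn),      # recall on malignant cases
        "specificity": 100 * tn / (tn + fp),      # recall on benign cases
        "f1": 100 * 2 * tp / (2 * tp + fp + fn),  # harmonic mean of precision and recall
    }


# Hypothetical counts consistent with the 'Mesh Only' LIDC-PM row (N=72).
metrics = binary_metrics(tp=23, fp=3, tn=28, fn=18)
print({name: round(value, 2) for name, value in metrics.items()})
# {'accuracy': 70.83, 'sensitivity': 56.1, 'specificity': 90.32, 'f1': 68.66}
```

Reading the table this way makes the trade-off visible: on LIDC-PM the Mesh Only model misses many malignant cases (low sensitivity) while rarely flagging benign ones (high specificity).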

Acknowledgments

Reference

If you find our work useful in your research or if you use parts of this code or the dataset, please cite the following papers:

@article{choi2022cirdataset,
  title={CIRDataset: A large-scale Dataset for Clinically-Interpretable lung nodule Radiomics and malignancy prediction},
  author={Choi, Wookjin and Dahiya, Navdeep and Nadeem, Saad},
  journal={International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI)},
  year={2022},
}

@article{choi2021reproducible,
  title={Reproducible and Interpretable Spiculation Quantification for Lung Cancer Screening},
  author={Choi, Wookjin and Nadeem, Saad and Alam, Sadegh R and Deasy, Joseph O and Tannenbaum, Allen and Lu, Wei},
  journal={Computer Methods and Programs in Biomedicine},
  volume={200},
  pages={105839},
  year={2021},
  publisher={Elsevier}
}