Multi-Label Classification and Visual Highlight of Chest X-ray Images using Neural Networks with Attention Mechanism and Grad-CAM

Marcus Hwai Yik Tan, Xiaohan Tian, Wing Chan, Joshua Ceaser
University of Illinois, Urbana-Champaign

Quick Start

Set ALL_IMAGE_DIR to the folder containing the X-ray images
Set BASE_PATH_LABELS to the folder containing the lists of training, validation and test image file names
Run any one of the following notebooks or scripts: t01-multilabel-main-test.ipynb, t01-multilabel-non_image_features-main-test.ipynb, t01-multilabel-main-val.ipynb, t01-multilabel-non_image_features-main-val.ipynb, t01-multilabel-main-test.py, t01-multilabel-non_image_features-main-test.py, t01-multilabel-main-val.py, t01-multilabel-non_image_features-main-val.py
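
Before launching a notebook or script, it can help to confirm that both folders exist. The snippet below is a minimal sketch of such a check. The paths are placeholders, missing_dirs is a hypothetical helper that is not part of this repository, and the notebooks may expect plain strings rather than Path objects.

```python
from pathlib import Path

# Placeholder locations (assumptions); replace with the folders on your machine.
ALL_IMAGE_DIR = Path("data/images")
BASE_PATH_LABELS = Path("labels")


def missing_dirs(*dirs):
    """Return the entries of `dirs` that do not exist as directories."""
    return [d for d in dirs if not Path(d).is_dir()]


if __name__ == "__main__":
    # Report every configured folder that cannot be found.
    for d in missing_dirs(ALL_IMAGE_DIR, BASE_PATH_LABELS):
        print(f"Missing folder: {d}")
```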

p02-dataset-selection-multilabel.ipynb: selects a subset of images for the training, validation and test lists and generates multiple training/validation splits. The default folder is "labels", where Data_Entry_2017_v2020.csv is also located. This notebook can be skipped, since the lists of selected images used in the final report are already included in the "labels" folder: train_val_A.csv, train_A_x.csv (x=1,2,3), val_A_x.csv (x=1,2,3) and test_A.csv.
t01-multilabel-main-val.ipynb: train and evaluate the model on multiple training/validation splits
t01-multilabel-main-val.py: Python version of t01-multilabel-main-val.ipynb
t01-multilabel-main-test.ipynb: train the model on the combined training and validation dataset and evaluate it on a test dataset
t01-multilabel-main-test.py: Python version of t01-multilabel-main-test.ipynb
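
The split lists follow the naming pattern train_A_x.csv / val_A_x.csv with x=1,2,3. As a rough illustration of iterating over those splits, the sketch below builds the expected file paths. The function split_files and its parameters are hypothetical and do not appear in the repository.

```python
from pathlib import Path


def split_files(base_path="labels", n_splits=3, tag="A"):
    """Return (train, val) CSV path pairs named train_<tag>_<x>.csv / val_<tag>_<x>.csv."""
    base = Path(base_path)
    return [
        (base / f"train_{tag}_{x}.csv", base / f"val_{tag}_{x}.csv")
        for x in range(1, n_splits + 1)
    ]


if __name__ == "__main__":
    # List each training/validation pair in order.
    for train_csv, val_csv in split_files():
        print(train_csv, val_csv)
```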

p02-dataset-add_non_image_features.ipynb: Append non-image features to existing training, validation and test lists generated by p02-dataset-selection-multilabel.ipynb
t01-multilabel-non_image_features-main-val.ipynb: same function as t01-multilabel-main-val.ipynb but with non-image features as additional inputs
t01-multilabel-non_image_features-main-val.py: Python version of t01-multilabel-non_image_features-main-val.ipynb
t01-multilabel-non_image_features-main-test.ipynb: same function as t01-multilabel-main-test.ipynb but with non-image features as additional inputs
t01-multilabel-non_image_features-main-test.py: Python version of t01-multilabel-non_image_features-main-test.ipynb

t01-multilabel-test.py: load a saved model and evaluate on a test dataset.
p02-dataset-stats.ipynb: analyze the chest X-ray dataset and draw statistical charts.
pp01-postprocess-performance.ipynb: postprocess the performance stats in the performance directory.

t03-multilabel-heatmap-densenet121-v2.ipynb: load a saved model and draw a heatmap for a given input image. Note that MODEL_NAME can only be densenet121. A DenseNet-121 model trained for 8 epochs on the images listed in train_val_A.csv is provided in the models folder.
t03-multilabel-heatmap-densenet121attA-v2.ipynb: load a saved model and draw a heatmap for a given input image. Note that MODEL_NAME can only be densenet121attA. A DenseNet-121-attA model trained for 8 epochs on the images listed in train_val_A.csv is provided in the models folder.
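
For readers unfamiliar with Grad-CAM, the core computation can be sketched in a few lines of NumPy. This illustrates the general technique only and is not the code used in the notebooks. It assumes the feature maps of the last convolutional layer and the gradients of one class score with respect to them have already been extracted as arrays of shape (C, H, W). The function name grad_cam is hypothetical.

```python
import numpy as np


def grad_cam(activations, gradients):
    """Compute a normalized Grad-CAM map from (C, H, W) activations and gradients."""
    # Channel weights: global average of the gradients over the spatial axes.
    weights = gradients.mean(axis=(1, 2))               # shape (C,)
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)    # shape (H, W)
    # ReLU keeps only regions with a positive influence on the class score.
    cam = np.maximum(cam, 0.0)
    # Scale to [0, 1] unless the map is identically zero.
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

The resulting (H, W) map is typically upsampled to the input image size and overlaid on the X-ray as a color heatmap.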

The following modules are built on the standard DenseNet model from https://github.com/pytorch/vision/blob/release/0.9/torchvision/models/densenet.py
- densenet_models.py: standard DenseNet model and the additional functions to generate the heatmaps for DenseNet-121 only using Grad-CAM
- densenet121attA.py: DenseNet-121-attA model and the functions to generate the heatmaps for that model
- densenet121attB.py: DenseNet-121-attB model
- densenet_models_w_non_image.py: standard DenseNet model + non-image feature model
- densenet121attA_w_non_image.py: DenseNet-121-attA model + non-image feature model
resnet_models.py: standard ResNet model from https://github.com/pytorch/vision/blob/release/0.9/torchvision/models/resnet.py

labels: contains "Data_Entry_2017_v2020.csv" and the lists of training, validation and test subsets of images used in the final report
models: contains two trained models, DenseNet-121 and DenseNet-121-attA, that can be used to generate the heatmaps
heatmaps: output folder for the heatmaps
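
In multi-label classification each image has a target vector with one entry per finding. The sketch below shows one way such vectors could be built, assuming that Data_Entry_2017_v2020.csv has an "Image Index" column and a "Finding Labels" column with findings separated by "|". The sample rows are made up for illustration, the class list is shortened, and multi_hot is a hypothetical helper rather than the repository's own code.

```python
import csv
import io


def multi_hot(finding_labels, classes, sep="|"):
    """Encode a separator-joined label string as a 0/1 list aligned with `classes`."""
    present = set(finding_labels.split(sep))
    return [1 if c in present else 0 for c in classes]


# Made-up rows in the assumed column layout (not real dataset entries).
SAMPLE = (
    "Image Index,Finding Labels\n"
    "example_1.png,Cardiomegaly|Effusion\n"
    "example_2.png,No Finding\n"
)
# Shortened class list for illustration only.
CLASSES = ["Atelectasis", "Cardiomegaly", "Effusion"]

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
targets = {r["Image Index"]: multi_hot(r["Finding Labels"], CLASSES) for r in rows}
```

An image with no listed finding maps to an all-zero vector, which is why "No Finding" does not need its own class in this sketch.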

CPU: AMD Ryzen 5 4600H
GPU: NV GTX 1650 / NV RTX 2060 Max-Q

c5 series
p2 series

Python 3.7+
PyTorch, torchvision, PIL (Pillow), NumPy, pandas, scikit-learn, matplotlib, importlib, datetime, time
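
A quick way to see which third-party packages are missing from an environment is sketched below. The import names in REQUIRED (for example "sklearn" for scikit-learn and "PIL" for Pillow) are the usual ones but are listed here as an assumption. missing_packages is a hypothetical helper. importlib, datetime and time ship with Python and need no installation.

```python
import importlib.util

# Assumed import names for the third-party dependencies listed above.
REQUIRED = ["torch", "torchvision", "PIL", "numpy", "pandas", "sklearn", "matplotlib"]


def missing_packages(names=REQUIRED):
    """Return the top-level module names that cannot be found in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```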