The official implementation of "Delving into Masked Autoencoders for Multi-Label Thorax Disease Classification"

Medical Masked Autoencoders

Paper

This repository provides the official implementation of training Vision Transformers (ViT) for (2D) medical imaging tasks as well as the usage of the pre-trained ViTs in the following paper:

Delving into Masked Autoencoders for Multi-Label Thorax Disease Classification
Junfei Xiao, Yutong Bai, Alan Yuille, Zongwei Zhou
Johns Hopkins University
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023
paper | code

TO DO

  • Instructions for preparing datasets.
  • Instructions for pretraining and fine-tuning.

Image reconstruction demo

Installing Requirements

Our codebase follows the official MAE implementation and uses some additional packages. You can build the environment with either Conda or pip using one of the following commands.

Conda:

conda env create -n medical_mae -f medical_mae.yml

Pip:

conda create -n medical_mae python=3.8
conda activate medical_mae
pip install -r requirements.txt 

Preparing Datasets

The MIMIC-CXR, CheXpert, and ChestX-ray14 datasets are publicly available on their official sites. You can download them, or request access to them, under their respective data use agreements.

They are also available through the following links, for research use only and subject to the same official agreements.

MIMIC-CXR (JPG): https://physionet.org/content/mimic-cxr-jpg/2.0.0/

CheXpert (v1.0-small): https://www.kaggle.com/datasets/ashery/chexpert

ChestX-ray14: https://www.kaggle.com/datasets/nih-chest-xrays/data
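Because the task is multi-label, each image needs a multi-hot target rather than a single class index. The sketch below shows one way to build such a target from a ChestX-ray14 annotation, assuming the findings arrive as a `|`-separated string (as in the `Finding Labels` column of the dataset's CSV). The function name `encode_findings` and the class order are illustrative choices and are not taken from this repository.

```python
# Illustrative sketch: encode a ChestX-ray14 label string as a multi-hot vector.
# The class order below is an arbitrary assumption, not the repository's order.
CLASSES = [
    "Atelectasis", "Cardiomegaly", "Effusion", "Infiltration", "Mass",
    "Nodule", "Pneumonia", "Pneumothorax", "Consolidation", "Edema",
    "Emphysema", "Fibrosis", "Pleural_Thickening", "Hernia",
]


def encode_findings(finding_labels, classes=CLASSES):
    """Turn a '|'-separated label string into a list of 0/1 values."""
    present = {name.strip() for name in finding_labels.split("|")}
    # Names outside `classes` (for example "No Finding") are ignored,
    # so an image without findings maps to an all-zero vector.
    return [1 if name in present else 0 for name in classes]


if __name__ == "__main__":
    print(encode_findings("Cardiomegaly|Effusion"))
    print(encode_findings("No Finding"))
```

An image with no listed finding yields an all-zero vector, which is the usual convention for multi-label targets.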

Pre-training on ImageNet or Chest X-rays

Pre-training instructions are in PRETRAIN.md.
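For readers new to masked autoencoders, the core pre-training step is to hide a random subset of image patches and reconstruct them from the visible ones. The sketch below illustrates only that patch selection in plain Python. The function name `random_masking`, the default `mask_ratio` of 0.75, and the `seed` argument are assumptions for illustration; the settings actually used by this repository are described in PRETRAIN.md.

```python
import random


def random_masking(num_patches, mask_ratio=0.75, seed=None):
    """Pick which patches stay visible; returns (keep_indices, mask).

    mask[i] is 1 if patch i is hidden and 0 if it is visible.
    """
    rng = random.Random(seed)
    num_keep = int(num_patches * (1 - mask_ratio))
    shuffled = list(range(num_patches))
    rng.shuffle(shuffled)
    keep = sorted(shuffled[:num_keep])
    kept = set(keep)
    mask = [0 if i in kept else 1 for i in range(num_patches)]
    return keep, mask


if __name__ == "__main__":
    # A 224x224 image split into 16x16 patches gives 14 * 14 = 196 patches.
    keep, mask = random_masking(196, mask_ratio=0.75, seed=0)
    print(len(keep), sum(mask))
```

Only the visible patches would be passed to the encoder; the mask tells the decoder which positions it has to reconstruct.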

Fine-tuning with pre-trained checkpoints

Fine-tuning instructions are in FINETUNE.md.

The following table provides the pre-trained checkpoints used in Table 1:

You can download all the weights in the following table with this link (google drive).

Model        | Pretrained Dataset | Method         | Pretrained           | Finetuned (NIH Chest X-ray) | mAUC
------------ | ------------------ | -------------- | -------------------- | --------------------------- | ----
DenseNet-121 | ImageNet           | Categorization | torchvision official | google drive                | 82.2
ResNet-50    | ImageNet           | MoCo v2        | google drive         | google drive                | 80.9
ResNet-50    | ImageNet           | BYOL           | google drive         | google drive                | 81.0
ResNet-50    | ImageNet           | SwAV           | google drive         | google drive                | 81.5
DenseNet-121 | X-rays (0.3M)      | MoCo v2        | google drive         | google drive                | 80.6
DenseNet-121 | X-rays (0.3M)      | MAE            | google drive         | google drive                | 81.2
ViT-Small/16 | ImageNet           | Categorization | DeiT Official        | google drive                | 79.6
ViT-Small/16 | ImageNet           | MAE            | google drive         | google drive                | 78.6
ViT-Small/16 | X-rays (0.3M)      | MAE            | google drive         | google drive                | 82.3
ViT-Base/16  | X-rays (0.5M)      | MAE            | google drive         | google drive                | 83.0

Model        | Pretrained Dataset | Finetuned (Chest X-ray) | mAUC | Finetuned (CheXpert) | mAUC | Finetuned (COVIDx) | Accuracy
------------ | ------------------ | ----------------------- | ---- | -------------------- | ---- | ------------------ | --------
ViT-Small/16 | X-rays (0.3M)      | google drive            | 82.3 | google drive         | 89.2 | google drive       | 95.2
ViT-Base/16  | X-rays (0.5M)      | google drive            | 83.0 | google drive         | 89.3 | google drive       | 95.3
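The mAUC column reports the area under the ROC curve averaged over disease classes. The sketch below shows how such a number can be computed in plain Python, assuming an unweighted mean over classes and skipping any class that lacks either positives or negatives in the evaluation set. The helper names `binary_auc` and `mean_auc` are hypothetical, and this is not the repository's evaluation code.

```python
def binary_auc(labels, scores):
    """ROC AUC for one class as the fraction of (positive, negative) pairs
    ranked correctly, counting ties as half. Returns None if undefined."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(y_true, y_score):
    """Unweighted mean of per-class AUCs; rows are samples, columns are classes."""
    aucs = []
    for c in range(len(y_true[0])):
        auc = binary_auc([row[c] for row in y_true],
                         [row[c] for row in y_score])
        if auc is not None:
            aucs.append(auc)
    if not aucs:
        raise ValueError("no class has both positive and negative samples")
    return sum(aucs) / len(aucs)


if __name__ == "__main__":
    y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
    y_score = [[0.9, 0.2], [0.1, 0.8], [0.8, 0.3], [0.3, 0.7]]
    print(mean_auc(y_true, y_score))
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; a rank-based implementation is preferable for full test sets. The tables list mAUC as a percentage, so the value returned here would be multiplied by 100 for comparison.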

Citation

If you use this code or our pre-trained weights in your research, please cite our paper:

@inproceedings{xiao2023delving,
  title={Delving into masked autoencoders for multi-label thorax disease classification},
  author={Xiao, Junfei and Bai, Yutong and Yuille, Alan and Zhou, Zongwei},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={3588--3600},
  year={2023}
}

License

This repository is released under the Apache 2.0 license.

Acknowledgement

This work was supported by the Lustgarten Foundation for Pancreatic Cancer Research.

Our code is built upon facebookresearch/mae.