MM-Retinal: Knowledge-Enhanced Foundational Pretraining with Fundus Image-Text Expertise

Paper [arXiv]   Dataset [Google Drive]   Code [GitHub]

by Ruiqi Wu, Chenran Zhang, Jianle Zhang, Yi Zhou, Tao Zhou, and Huazhu Fu, in MICCAI 2024.

We propose MM-Retinal, a multi-modal dataset that encompasses high-quality image-text pairs collected from professional fundus diagram books.

Building on MM-Retinal, we also present KeepFIT, a novel Knowledge-enhanced foundational pretraining model that incorporates Fundus Image-Text expertise.

🚀 Updates

🌻 Data Collection and Statistics

The four professional fundus diagram books are

Our semi-automatic dataset construction pipeline consists of four steps:

  • Step 1: Image-Text Pair Collection
  • Step 2: Image-Text Alignment
  • Step 3: Modality Classification
  • Step 4: Text Cleaning and Bilingual Translation

A six-person team took four weeks to complete MM-Retinal.

🌈 Download Pre-training Datasets

  • MM-Retinal v1 (CFP+FFA+OCT): the current version of the MM-Retinal dataset includes 2,169 CFP cases, 1,947 FFA cases, and 233 OCT cases. Each case comes with an image and text in both English and Chinese.
  • FLAIR (CFP): compiles 37 open-access fundus image datasets covering 96 categories, with up to 284,660 images. These datasets provide category-level labels for classification.
  • SynFundus-1M (CFP): a synthetic dataset of 1 million images covering 14 diseases, generated by a diffusion model trained on 1.3 million private fundus images.
  • FFA-IR (FFA): provides 10,790 reports and 1,048,584 images from clinical practice, with a schema of 46 lesion categories and bilingual reports.
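To make the MM-Retinal v1 numbers above easier to work with, here is a minimal sketch that totals the stated case counts and computes each modality's share. The `RetinalCase` record and its field names are hypothetical, chosen only to mirror the description "an image and texts in both English and Chinese"; they are not the dataset's actual file schema.

```python
from dataclasses import dataclass

# Case counts per modality, as stated for MM-Retinal v1 above.
MM_RETINAL_V1_CASES = {"CFP": 2169, "FFA": 1947, "OCT": 233}


@dataclass
class RetinalCase:
    """Hypothetical record for one case; field names are illustrative only."""
    image_path: str
    modality: str  # one of "CFP", "FFA", "OCT"
    text_en: str   # English description
    text_zh: str   # Chinese description


def modality_shares(counts):
    """Return each modality's fraction of the total number of cases."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


total_cases = sum(MM_RETINAL_V1_CASES.values())
shares = modality_shares(MM_RETINAL_V1_CASES)

print(f"total cases: {total_cases}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The three modalities add up to 4,349 cases, with OCT by far the smallest group, which is worth keeping in mind when sampling across modalities.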

🌴 Quick Start

1. Environment

Clone the whole repository and install the dependencies.

  • Python 3.8.18
  • PyTorch 1.13.1
  • CUDA 12.0
conda create -n mmretinal python=3.8
conda activate mmretinal

git clone https://github.com/lxirich/MM-Retinal.git
cd MM-Retinal/KeepFIT/KeepFIT-CFP   # or: cd MM-Retinal/KeepFIT/KeepFIT-FFA
pip install -r requirements.txt
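After installation, it can help to confirm that the active environment roughly matches the versions listed above. The following is a small sanity-check sketch, not part of the repository; the function name `check_environment` is made up for illustration, and it only reports what it finds rather than enforcing exact versions.

```python
import importlib.util
import sys


def check_environment(expected_python=(3, 8)):
    """Collect basic version info for the current environment."""
    report = {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "python_matches": tuple(sys.version_info[:2]) == expected_python,
        # find_spec avoids importing torch when it is not installed.
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }
    if report["torch_installed"]:
        import torch

        report["torch"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `torch_installed` is false or `cuda_available` is false on a GPU machine, the install step above is the first place to look.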

2. Training

For color fundus photography (CFP) modality:

  • Define the relative paths for pre-training datasets and dataframes in ./local_data/constants.py.

  • Prepare the pre-training dataset dataframes in ./local_data/prepare_partitions.py.

python main_pretrain.py --epochs 40 --batch_size 24 --num_workers 4

For fundus fluorescein angiography (FFA) modality:

  • Run the training script directly:
python main.py

3. Evaluation

For color fundus photography (CFP) modality:

  • Define the relative paths for evaluation datasets and dataframes in ./local_data/constants.py.

  • Finetune

    python main_transferability.py --shots_train 80% --shots_test 20% --folds 5 --experiment 08_ODIR200x3 --method lp --domain_knowledge True --project_features False
    
  • Few-shot

    python main_transferability.py --shots_train 5 --shots_test 20% --folds 5 --experiment 08_ODIR200x3 --method clipAdapter --domain_knowledge True --project_features True
    
  • Zero-shot

    python main_transferability.py --shots_train 0% --shots_test 100% --experiment 08_ODIR200x3 --method zero_shot --domain_knowledge True --project_features True
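The three evaluation commands above differ only in a handful of flag values. As a convenience, the sketch below assembles them from a single table so the differences are easy to see and to script. The `SETTINGS` dictionary and the `build_command` helper are hypothetical and not part of the repository; the flag names and values are copied from the commands above.

```python
import shlex

# Flag values copied from the three commands above. The dictionary layout
# is only an illustration and does not exist in the repository.
SETTINGS = {
    "finetune": {"shots_train": "80%", "shots_test": "20%", "folds": "5",
                 "method": "lp", "project_features": "False"},
    "few_shot": {"shots_train": "5", "shots_test": "20%", "folds": "5",
                 "method": "clipAdapter", "project_features": "True"},
    "zero_shot": {"shots_train": "0%", "shots_test": "100%",
                  "method": "zero_shot", "project_features": "True"},
}


def build_command(setting, experiment="08_ODIR200x3"):
    """Return the argument list for one evaluation setting."""
    args = ["python", "main_transferability.py"]
    for flag, value in SETTINGS[setting].items():
        args += [f"--{flag}", value]
    args += ["--experiment", experiment, "--domain_knowledge", "True"]
    return args


for name in SETTINGS:
    # shlex.join (Python 3.8+) renders the list as a shell-safe string.
    print(shlex.join(build_command(name)))
```

Each returned list can be passed directly to `subprocess.run` from inside the KeepFIT-CFP directory. Note that the zero-shot setting carries no `--folds` flag, matching the command shown above.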
    

For fundus fluorescein angiography (FFA) modality:

  • Validation and testing run automatically in each epoch.

🔭 Results

1. Finetune

2. Few-shot and Zero-shot

3. Ablation Study

🎯 Checkpoints

Model                               Checkpoint
KeepFIT (flair+MM)                  Link
KeepFIT (50%flair+MM)               Link
KeepFIT (FFA-IR+MM)                 Link
Image_captioning (FFA-IR+MM)        Link

💘 Acknowledgements

FLAIR -- https://github.com/jusiro/FLAIR

FFA-IR -- https://github.com/mlii0117/FFA-IR

SynFundus-1M -- https://github.com/parap1uie-s/SynFundus-1M

🌟 Citation

If you find this repository useful, please consider citing this paper:

@article{
}

📬 Contact

If you have any questions, please feel free to contact ruiqiwu@seu.edu.cn.