A General-Purpose Multimodal Foundation Model for Dermatology

PanDerm

A General-Purpose Multimodal Foundation Model for Dermatology [Arxiv Paper] [Cite]

Introduction

[abstract] Diagnosing and treating skin diseases require advanced visual skills across multiple domains and the ability to synthesize information from various imaging modalities. Current deep learning models, while effective at specific tasks such as diagnosing skin cancer from dermoscopic images, fall short in addressing the complex, multimodal demands of clinical practice. Here, we introduce PanDerm, a multimodal dermatology foundation model pretrained through self-supervised learning on a dataset of over 2 million real-world images of skin diseases, sourced from 11 clinical institutions across 4 imaging modalities. We evaluated PanDerm on 28 diverse datasets covering a range of clinical tasks, including skin cancer screening, phenotype assessment, and risk stratification, diagnosis of neoplastic and inflammatory skin diseases, skin lesion segmentation, change monitoring, and metastasis prediction and prognosis. PanDerm achieved state-of-the-art performance across all evaluated tasks, often outperforming existing models even when using only 5-10% of labeled data. PanDerm’s clinical utility was demonstrated through reader studies in real-world clinical settings across multiple imaging modalities. It outperformed clinicians by 10.2% in early-stage melanoma detection accuracy and enhanced clinicians’ multiclass skin cancer diagnostic accuracy by 11% in a collaborative human-AI setting. Additionally, PanDerm demonstrated robust performance across diverse demographic factors, including different body locations, age groups, genders, and skin tones. The strong results in benchmark evaluations and real-world clinical scenarios suggest that PanDerm could enhance the management of skin diseases and serve as a model for developing multimodal foundation models in other medical specialties, potentially accelerating the integration of AI support in healthcare.

Installation

First, clone the repo and cd into the directory:

git clone https://github.com/SiyuanYan1/PanDerm
cd PanDerm

Then create a conda env and install the dependencies:

conda create -n PanDerm python=3.10 -y
conda activate PanDerm
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
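
As an optional sanity check after installation, the following sketch compares the installed package versions against the pins in the pip command above. The helper name `check_pins` is hypothetical and not part of the repository; it relies only on the standard-library `importlib.metadata`.

```python
from importlib import metadata

# Version pins copied from the pip install command above.
EXPECTED = {"torch": "2.4.1", "torchvision": "0.19.1", "torchaudio": "2.4.1"}


def check_pins(expected):
    """Return {package: status}: 'ok', 'missing', or 'mismatch (found X)'."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        # CUDA wheels carry a local suffix such as '2.4.1+cu118';
        # compare only the public part of the version string.
        public = found.split("+", 1)[0]
        report[name] = "ok" if public == wanted else f"mismatch (found {found})"
    return report


if __name__ == "__main__":
    for package, status in check_pins(EXPECTED).items():
        print(f"{package}: {status}")
```

Run it inside the activated PanDerm environment; a "missing" or "mismatch" entry suggests the pip step did not complete as intended.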

1. Download PanDerm Pre-trained Weights

Obtaining the Model Weights

Download the pre-trained model weights from this Google Drive link.

Configuring the Model Path

After downloading, you need to update the model weights path in the code:

  1. Open the file PanDerm/linear_probe/models/builder.py
  2. Locate line 42
  3. Replace the existing path with the directory where you saved the model weights:
root_path = '/path/to/your/PanDerm/Model_Weights/'
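
Before running any evaluation, it can help to confirm that the directory assigned to root_path exists and actually contains checkpoint files. The sketch below is illustrative only: the helper `list_checkpoints` is hypothetical, and the `.pth`/`.pt` extensions are an assumption about how the downloaded weights are named.

```python
from pathlib import Path


def list_checkpoints(root_path, extensions=(".pth", ".pt")):
    """Return sorted checkpoint file names found directly under root_path."""
    root = Path(root_path)
    if not root.is_dir():
        raise FileNotFoundError(f"Model weights directory not found: {root}")
    # The extensions are an assumption; adjust them to match the downloaded files.
    return sorted(
        p.name for p in root.iterdir() if p.is_file() and p.suffix in extensions
    )


if __name__ == "__main__":
    # Same placeholder path as in builder.py; replace it with your own directory.
    root_path = "/path/to/your/PanDerm/Model_Weights/"
    try:
        print(list_checkpoints(root_path))
    except FileNotFoundError as err:
        print(err)
```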

2. Data Organization for Classification

We've pre-processed the public datasets used in this study. To reproduce the results reported in our paper and prevent data leakage between splits, please use these processed datasets.

If you wish to use our model with your own dataset, please organize it in the same format as these pre-processed datasets.
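
When preparing your own dataset, one way to compare it against the processed CSV files is to count rows per split and look for images assigned to more than one split. This is a minimal sketch, not code from the repository: `summarize_splits` is a hypothetical helper, and the column names `image` and `split` in the usage comment are assumptions that should be replaced with the actual headers of the processed CSV.

```python
import csv
from collections import Counter, defaultdict


def summarize_splits(csv_path, id_col, split_col):
    """Count rows per split and list ids that occur in more than one split."""
    rows_per_split = Counter()
    splits_per_id = defaultdict(set)
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        missing = {id_col, split_col} - set(reader.fieldnames or [])
        if missing:
            raise KeyError(f"Columns not found in {csv_path}: {sorted(missing)}")
        for row in reader:
            rows_per_split[row[split_col]] += 1
            splits_per_id[row[id_col]].add(row[split_col])
    # An id seen under two different split values indicates leakage.
    leaked = sorted(i for i, s in splits_per_id.items() if len(s) > 1)
    return dict(rows_per_split), leaked


# Usage (column names are assumptions; check the header of the processed CSV):
# counts, leaked = summarize_splits("my_dataset.csv", id_col="image", split_col="split")
```

An empty `leaked` list means no image id appears under more than one split value.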

Public Dataset Links and Splits

| Dataset | Processed Data | Original Data |
|---|---|---|
| HAM10000 | Download | Official Website |
| BCN20000 | Download | Official Website |
| DDI | Download | Official Website |
| Derm7pt | Download | Official Website |
| Dermnet | Download | Official Website |
| HIBA | Download | Official Website |
| MSKCC | Download | Official Website |
| PAD-UFES | Download | Official Website |
| PATCH16 | Download | Official Website |

Note: The processed datasets may differ slightly from those provided on the official websites. To ensure reproducibility of our paper's results, please use the processed data links provided above.

3. Linear Evaluation on Downstream Tasks

The example below trains and evaluates a linear classifier on HAM10000. To use your own dataset, replace the values of --csv_path and --root_path with your own paths.

Key Parameters

  • nb_classes: Set this to the number of classes in your evaluation dataset.
  • batch_size: Adjust based on the memory size of your GPU.
  • percent_data: Fraction of the training data to use. For example, 0.1 evaluates the model using 10% of the training data. Change this value to run label-efficiency experiments.
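
To make the effect of percent_data concrete, the sketch below keeps a fixed fraction of the training samples in each class, so that class proportions are roughly preserved. It is an illustration under assumptions, not the repository's implementation: `subsample_per_class` is a hypothetical helper, and the actual sampling strategy in linear_eval.py may differ (for example, it may sample without stratification).

```python
import random
from collections import defaultdict


def subsample_per_class(labels, percent_data, seed=0):
    """Return sorted indices keeping about percent_data of the samples per class."""
    if not 0.0 < percent_data <= 1.0:
        raise ValueError("percent_data must be in (0, 1]")
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    kept = []
    for label in sorted(by_class):
        indices = by_class[label]
        # Keep at least one sample so that no class disappears from the subset.
        n_keep = max(1, round(percent_data * len(indices)))
        kept.extend(rng.sample(indices, n_keep))
    return sorted(kept)


if __name__ == "__main__":
    toy_labels = [0] * 90 + [1] * 10
    subset = subsample_per_class(toy_labels, percent_data=0.1)
    print(len(subset), "of", len(toy_labels), "training samples kept")
```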

Evaluation Command

cd linear_probe
CUDA_VISIBLE_DEVICES=0 python linear_eval.py \
  --batch_size 1000 \
  --model 'PanDerm' \
  --nb_classes 7 \
  --percent_data 1.0 \
  --csv_filename 'PanDerm_results.csv' \
  --output_dir "/path/to/your/PanDerm/LP_Eval/output_dir2/ID_Res/PanDerm_res/" \
  --csv_path "/path/to/your/PanDerm/Evaluation_datasets/HAM10000_clean/ISIC2018_splits/HAM_clean.csv" \
  --root_path "/path/to/your/PanDerm/Evaluation_datasets/HAM10000_clean/ISIC2018/"
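
For label-efficiency experiments it is often convenient to run the same command at several percent_data values. The sketch below assembles the argument list for each run using only the flags shown above; `build_linear_eval_cmd` is a hypothetical helper, and the per-run results file naming is an assumption. The commands are printed rather than executed.

```python
import shlex


def build_linear_eval_cmd(percent_data, csv_path, root_path, output_dir,
                          nb_classes=7, batch_size=1000, model="PanDerm"):
    """Assemble the linear_eval.py argument list for one run."""
    return [
        "python", "linear_eval.py",
        "--batch_size", str(batch_size),
        "--model", model,
        "--nb_classes", str(nb_classes),
        "--percent_data", str(percent_data),
        # Assumed naming scheme: one results file per data fraction.
        "--csv_filename", f"PanDerm_results_{percent_data}.csv",
        "--output_dir", output_dir,
        "--csv_path", csv_path,
        "--root_path", root_path,
    ]


if __name__ == "__main__":
    base = "/path/to/your/PanDerm"
    for fraction in (0.05, 0.1, 1.0):
        cmd = build_linear_eval_cmd(
            fraction,
            csv_path=f"{base}/Evaluation_datasets/HAM10000_clean/ISIC2018_splits/HAM_clean.csv",
            root_path=f"{base}/Evaluation_datasets/HAM10000_clean/ISIC2018/",
            output_dir=f"{base}/LP_Eval/output_dir2/ID_Res/PanDerm_res/",
        )
        print(shlex.join(cmd))
```

To execute a command instead of printing it, one option is `subprocess.run(cmd, check=True, cwd="linear_probe")`, with CUDA_VISIBLE_DEVICES set in the environment as in the shell example.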

More Usage Cases

For additional evaluation datasets, please refer to the bash scripts for detailed usage. We provide scripts that run the evaluation on 9 public datasets, and you can choose the model from the available options.

To run the evaluations:

cd linear_probe
bash script/lp.sh

Starter Code for Beginners: Loading and Using Our Model

Check out our easy-to-follow Jupyter Notebook:

HAM_clean_evaluation.ipynb

This notebook shows you how to:

  • Load our pre-trained model
  • Use it for feature extraction
  • Perform basic classification
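
One common way to classify on extracted features is linear probing: the encoder stays frozen and only a linear classifier is trained on its output. The sketch below illustrates that idea in plain Python on toy 2-D features. It does not load PanDerm and is not taken from the notebook; `train_linear_probe` and `predict` are hypothetical helpers. In practice the feature vectors would come from the pre-trained model, and a library implementation would replace the hand-written loop.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_linear_probe(features, labels, n_classes, lr=0.1, epochs=200):
    """Fit a softmax classifier with full-batch gradient descent."""
    dim = len(features[0])
    n = len(features)
    weights = [[0.0] * dim for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        grad_w = [[0.0] * dim for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, y in zip(features, labels):
            logits = [sum(w * v for w, v in zip(weights[c], x)) + biases[c]
                      for c in range(n_classes)]
            probs = softmax(logits)
            for c in range(n_classes):
                # Gradient of the cross-entropy loss w.r.t. the logit of class c.
                delta = probs[c] - (1.0 if c == y else 0.0)
                grad_b[c] += delta
                for j in range(dim):
                    grad_w[c][j] += delta * x[j]
        for c in range(n_classes):
            biases[c] -= lr * grad_b[c] / n
            for j in range(dim):
                weights[c][j] -= lr * grad_w[c][j] / n
    return weights, biases


def predict(weights, biases, x):
    """Return the index of the class with the highest logit."""
    scores = [sum(w * v for w, v in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    return scores.index(max(scores))


if __name__ == "__main__":
    # Toy stand-in for frozen encoder features: two separated 2-D clusters.
    rng = random.Random(0)
    features = [[rng.gauss(-2, 0.5), rng.gauss(-2, 0.5)] for _ in range(50)]
    features += [[rng.gauss(2, 0.5), rng.gauss(2, 0.5)] for _ in range(50)]
    labels = [0] * 50 + [1] * 50
    weights, biases = train_linear_probe(features, labels, n_classes=2)
    correct = sum(predict(weights, biases, x) == y for x, y in zip(features, labels))
    print(f"training accuracy: {correct / len(labels):.2f}")
```

Only the classifier's weights and biases are updated here; the features are treated as fixed inputs, which is what distinguishes a linear probe from fine-tuning.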

4. Skin Lesion Segmentation

Please refer to details here.

Citation

@misc{yan2024generalpurposemultimodalfoundationmodel,
      title={A General-Purpose Multimodal Foundation Model for Dermatology}, 
      author={Siyuan Yan and Zhen Yu and Clare Primiero and Cristina Vico-Alonso and Zhonghua Wang and Litao Yang and Philipp Tschandl and Ming Hu and Gin Tan and Vincent Tang and Aik Beng Ng and David Powell and Paul Bonnington and Simon See and Monika Janda and Victoria Mar and Harald Kittler and H. Peter Soyer and Zongyuan Ge},
      year={2024},
      eprint={2410.15038},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2410.15038}, 
}