Medical Vision-Language Model

Dataset structure

root
├── images
│   ├── train
│   └── test
├── annotations
│   ├── train
│   │   └── grounded_diseases_train.json
│   └── test
│       └── grounded_diseases_test.json
└── pretrained_checkpoint
    └── checkpoint_stage3.pth
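
Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: `missing_dataset_paths` and `EXPECTED` are hypothetical names, and the relative paths are copied from the tree above.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
EXPECTED = [
    "images/train",
    "images/test",
    "annotations/train/grounded_diseases_train.json",
    "annotations/test/grounded_diseases_test.json",
    "pretrained_checkpoint/checkpoint_stage3.pth",
]


def missing_dataset_paths(root):
    """Return the expected entries that do not exist under `root`.

    Hypothetical helper for illustration; it only checks existence and
    does not inspect image or annotation contents.
    """
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


# Example: report anything missing under a dataset folder named "root".
for rel in missing_dataset_paths("root"):
    print(f"missing: {rel}")
```

An empty result means every file and folder from the tree was found.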

For checkpoint_stage3.pth, you can start from the pretrained model checkpoint below:

MiniGPT-v2 (after stage-3)
Download

Installation

  • Python == 3.10.13
conda create -n litegpt python=3.10.13
conda activate litegpt
git clone https://github.com/nngocson2002/LVLM-Med.git
cd LVLM-Med
pip install -r requirements.txt

Training

Set the visual encoder

We provide different visual encoders with the following keywords:

  • eva_clip_g
  • pubmed_clip_vit
  • biomed_clip
  • biomed_pubmed_clip

After selecting a visual encoder, set its keyword here at Line 7, and here at Line 8.

Set Paths for Training

  • Set the training image path to root/images/train here at Line 5.
  • Set the training annotations path to root/annotations/train/grounded_diseases_train.json here at Line 6.
  • Set the pretrained checkpoint path to root/pretrained_checkpoint/checkpoint_stage3.pth here at Line 9.
  • Set the checkpoint save path here at Line 44.

Set Paths for Evaluation (After Training)

  • Set the evaluation annotations path to root/annotations/test/grounded_diseases_test.json here at Line 27.
  • Set the evaluation image path to root/images/test here at Line 28.
  • Set the evaluation result output path here at Line 38.
  • Set the prompt you want to evaluate the model with here at Line 29.
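
The configuration entries generally need the full location of each file or folder. As a rough sketch, the absolute paths can be derived from the dataset root, following the directory tree in the Dataset structure section. The function name and the dictionary labels below are hypothetical and are not configuration keys used by the repository.

```python
from pathlib import Path


def config_paths(root):
    """Map a human-readable label to the absolute path under `root`.

    The labels are illustrative only; copy the printed paths into the
    corresponding configuration lines by hand.
    """
    root = Path(root).resolve()
    return {
        "training images": root / "images" / "train",
        "training annotations": root / "annotations" / "train" / "grounded_diseases_train.json",
        "pretrained checkpoint": root / "pretrained_checkpoint" / "checkpoint_stage3.pth",
        "evaluation annotations": root / "annotations" / "test" / "grounded_diseases_test.json",
        "evaluation images": root / "images" / "test",
    }


# Example: print the paths for a dataset folder named "root".
for label, path in config_paths("root").items():
    print(f"{label}: {path}")
```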

Run

torchrun --nproc-per-node NUM_GPU train.py \
         --cfg-path train_configs/train_vindrcxr.yaml \
         --cfg-eval-path eval_configs/eval_vindrcxr.yaml \
         --eval-dataset vindrcxr_val
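
If you launch training from a script, the same command can be assembled as an argument list, with NUM_GPU filled in explicitly. This is a minimal sketch: `build_train_command` is a hypothetical helper, and the flags and config paths are taken verbatim from the command above.

```python
import shlex


def build_train_command(num_gpu):
    """Assemble the training launch command as an argument list."""
    if num_gpu < 1:
        raise ValueError("num_gpu must be at least 1")
    return [
        "torchrun", "--nproc-per-node", str(num_gpu), "train.py",
        "--cfg-path", "train_configs/train_vindrcxr.yaml",
        "--cfg-eval-path", "eval_configs/eval_vindrcxr.yaml",
        "--eval-dataset", "vindrcxr_val",
    ]


# Print the command for two GPUs so it can be inspected before running.
print(shlex.join(build_train_command(2)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to start the run.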

Evaluation

To evaluate the model independently rather than during training, follow step 2 of the Training section, and then run:

torchrun --nproc-per-node NUM_GPU evaluate.py \
         --cfg-path eval_configs/eval_vindrcxr.yaml \
         --eval-dataset vindrcxr_val