
The code for the paper: PeFoM-Med: Parameter Efficient Fine-tuning on Multi-modal Large Language Models for Medical Visual Question Answering


PeFoMed

This is the official implementation of PeFoMed: Parameter Efficient Fine-tuning of Multimodal Large Language Models for Medical Imaging.

Figure 1: Overview of PeFoMed.

Datasets

The configuration for each dataset must be set in the corresponding dataset configuration file under `pefomed/configs/datasets/medical`.
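As an illustration, a dataset configuration file of this kind typically points the builder at the annotation files and image directory. This is a minimal sketch only: the dataset name `vqa_rad`, the key layout, and all paths are assumptions based on the LAVIS-style convention this repository builds on, so verify them against the actual files shipped in `pefomed/configs/datasets/medical`.

```yaml
# Hypothetical example: pefomed/configs/datasets/medical/vqa_rad.yaml
# Keys follow a LAVIS-style layout (assumption); replace the storage
# paths with the locations of your downloaded VQA-RAD data.
datasets:
  vqa_rad:
    data_type: images
    build_info:
      annotations:
        train:
          storage: /path/to/vqa_rad/trainset.json
        test:
          storage: /path/to/vqa_rad/testset.json
      images:
        storage: /path/to/vqa_rad/images
```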

Stage 1 fine-tuning datasets: ROCO, CLEF2022, MEDICAT, and MIMIC-CXR.

Stage 2 fine-tuning medical VQA datasets: VQA-RAD, PathVQA, and Slake.

Stage 2 fine-tuning MRG (medical report generation) dataset: IU-Xray.

Citation

If you're using PeFoMed in your research or applications, please cite using this BibTeX:

@misc{liu2024pefomedparameterefficientfinetuning,
      title={PeFoMed: Parameter Efficient Fine-tuning of Multimodal Large Language Models for Medical Imaging}, 
      author={Gang Liu and Jinlong He and Pengfei Li and Genrong He and Zhaolin Chen and Shenjun Zhong},
      year={2024},
      eprint={2401.02797},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2401.02797}, 
}

License

This repository is released under the BSD 3-Clause License.

Much of the code is based on LAVIS and MiniGPT-v2.