Awesome CLIP in Medical Imaging

Awesome License: MIT

🔥🔥 This is a collection of awesome articles about CLIP in medical imaging 🔥🔥

Citation

@article{zhao2023clip,
  title={CLIP in Medical Imaging: A Comprehensive Survey},
  author={Zihao Zhao and Yuxiao Liu and Han Wu and Yonghao Li and Sheng Wang and Lin Teng and Disheng Liu and Zhiming Cui and Qian Wang and Dinggang Shen},
  journal={arXiv preprint arXiv:2312.07353},
  year={2023},
}

Overview


Taxonomy of studies focusing on CLIP in the field of medical imaging.

Updates

  • arXiv preprint release: December 13, 2023
  • GitHub repo release: December 12, 2023

Dataset Resource

dataset   | domain    | image | text | source                         | language | pre-trained CLIP
----------|-----------|-------|------|--------------------------------|----------|-----------------
ROCO      | multiple  | 87K   | 87K  | research papers                | En       | PubMedCLIP
MedICaT   | multiple  | 217K  | 217K | research papers                | En       | /
PMC-OA    | multiple  | 1.6M  | 1.6M | research papers                | En       | PMCCLIP
ChiMed-VL | multiple  | 580K  | 580K | research papers                | En/zh    | /
FFA-IR    | fundus    | 1M    | 10K  | medical reports                | En/zh    | /
PadChest  | cxr       | 160K  | 109K | medical reports                | Sp       | /
MIMIC-CXR | cxr       | 377K  | 227K | medical reports                | En       | BioViL/BioViL-T
OpenPath  | histology | 208K  | 208K | social media                   | En       | PLIP
Quilt-1M  | histology | 1M    | 1M   | research papers, social media  | En       | QuiltNet

Pre-training

Multi-scale

[MICCAI 2020] Joint Modeling of Chest Radiographs and Radiology Reports for Pulmonary Edema Assessment
Geeticka Chauhan, Ruizhi Liao, William Wells, Jacob Andreas, Xin Wang, Seth Berkowitz, Steven Horng, Peter Szolovits, Polina Golland
[paper] [code]

[ICCV 2021] GLoRIA: A Multimodal Global-Local Representation Learning Framework for Label-efficient Medical Image Recognition
Shih-Cheng Huang, Liyue Shen, Matthew P. Lungren, Serena Yeung
[paper] [code]

[MICCAI 2021] Multimodal Representation Learning via Maximization of Local Mutual Information
Ruizhi Liao, Daniel Moyer, Miriam Cha, Keegan Quigley, Seth Berkowitz, Steven Horng, Polina Golland, and William M. Wells
[paper]

[ECCV 2022] Joint Learning of Localized Representations from Medical Images and Reports
Philip Müller, Georgios Kaissis, Congyu Zou, Daniel Rückert
[paper] [code]

[ECCV 2022] Making the Most of Text Semantics to Improve Biomedical Vision–Language Processing
Benedikt Boecking, Naoto Usuyama, Shruthi Bannur, Daniel C. Castro, Anton Schwaighofer, Stephanie Hyland, Maria Wetscherek, Tristan Naumann, Aditya Nori, Javier Alvarez-Valle, Hoifung Poon, and Ozan Oktay
[paper] [code]

[NeurIPS 2022 Workshop] The Role of Local Alignment and Uniformity in Image-Text Contrastive Learning on Medical Images
Philip Müller, Georgios Kaissis, Daniel Rueckert
[paper]

[MICCAI 2022] Breaking with Fixed Set Pathology Recognition through Report-Guided Contrastive Training
Constantin Seibold, Simon Reiß, M. Saquib Sarfraz, Rainer Stiefelhagen, Jens Kleesiek
[paper]

[MICCAI 2022] Vision-Language Contrastive Learning Approach to Robust Automatic Placenta Analysis Using Photographic Images
Yimu Pan, Alison D. Gernand, Jeffery A. Goldstein, Leena Mithal, Delia Mwinyelle, James Z. Wang
[paper]

[ICLR 2023] Advancing Radiograph Representation Learning with Masked Record Modeling
Hong-Yu Zhou, Chenyu Lian, Liansheng Wang, Yizhou Yu
[paper] [code]

[ICCV 2023] LIMITR: Leveraging Local Information for Medical Image-Text Representation
Gefen Dawidowicz, Elad Hirsch, Ayellet Tal
[paper]

[ICCV 2023] PRIOR: Prototype Representation Joint Learning from Medical Images and Reports
Pujin Cheng, Li Lin, Junyan Lyu, Yijin Huang, Wenhan Luo, Xiaoying Tang
[paper] [code]

[MICCAI 2023] Contrastive Masked Image-Text Modeling for Medical Visual Representation Learning
Cheng Chen, Aoxiao Zhong, Dufan Wu, Jie Luo, Quanzheng Li
[paper] [code]

[MICCAI 2023] Enhancing Automatic Placenta Analysis through Distributional Feature Recomposition in Vision-Language Contrastive Learning
Yimu Pan, Tongan Cai, Manas Mehta, Alison D. Gernand, Jeffery A. Goldstein, Leena Mithal, Delia Mwinyelle, Kelly Gallagher, James Z. Wang
[paper]

[MICCAI 2023] MedIM: Boost Medical Image Representation via Radiology Report-Guided Masking
Yutong Xie, Lin Gu, Tatsuya Harada, Jianpeng Zhang, Yong Xia, Qi Wu
[paper] [code]

[MLHC 2023] TIER: Text-Image Entropy Regularization for Medical CLIP-style models
Anil Palepu, Andrew Beam
[paper] [code]

[EMNLP 2023] Fine-grained Medical Vision-Language Representation Learning for Radiology Report Generation
Siyuan Wang, Bo Peng, Yichao Liu, Qi Peng
[paper]

[MedIA 2023] Self-supervised multi-modal training from uncurated images and reports enables monitoring AI in radiology
Sangjoon Park, Eun Sun Lee, Kyung Sook Shin, Jeong Eun Lee, Jong Chul Ye
[paper]

[TMM 2023] Multi-task Paired Masking with Alignment Modeling for Medical Vision-Language Pre-training
Ke Zhang, Yan Yang, Jun Yu, Hanliang Jiang, Jianping Fan, Qingming Huang, Weidong Han
[paper]

[ESA 2023] MITER: Medical Image–TExt joint adaptive pretRaining with multi-level contrastive learning
Chang Shu, Yi Zhu, Xiaochu Tang, Jing Xiao, Youxin Chen, Xiu Li, Qian Zhang, Zheng Lu
[paper] [code]

[arXiv 2023] Local Contrastive Learning for Medical Image Recognition
Syed A. Rizvi, Ruixiang Tang, Xiaoqian Jiang, Xiaotian Ma, Xia Hu
[paper]

[arXiv 2023] G2D: From Global to Dense Radiography Representation Learning via Vision-Language Pre-training
Che Liu, Cheng Ouyang, Sibo Cheng, Anand Shah, Wenjia Bai, Rossella Arcucci
[paper]

[arXiv 2023] Fine-Grained Image-Text Alignment in Medical Imaging Enables Cyclic Image-Report Generation
Wenting Chen, Xiang Li, Linlin Shen, Yixuan Yuan
[paper]

[arXiv 2024] MLIP: Medical Language-Image Pre-training with Masked Local Representation Learning
Jiarun Liu, Hong-Yu Zhou, Cheng Li, Weijian Huang, Hao Yang, Yong Liang, Shanshan Wang
[paper]

[arXiv 2024] Multimodal self-supervised learning for lesion localization
Hao Yang, Hong-Yu Zhou, Cheng Li, Weijian Huang, Jiarun Liu, Yong Liang, Shanshan Wang
[paper]

[arXiv 2024] Generalizable vision-language pre-training for annotation-free pathology localization
Hao Yang, Hong-Yu Zhou, Cheng Li, Weijian Huang, Jiarun Liu, Shanshan Wang
[paper]


Data-efficient

[EMNLP 2022] MedCLIP: Contrastive Learning from Unpaired Medical Images and Text
Zifeng Wang, Zhenbang Wu, Dinesh Agarwal, Jimeng Sun
[paper] [code]

[NeurIPS 2022] Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning
Fuying Wang, Yuyin Zhou, Shujun Wang, Varut Vardhanabhuti, Lequan Yu
[paper] [code]

[ISBRA 2023] TCSA: A Text-Guided Cross-View Medical Semantic Alignment Framework for Adaptive Multi-view Visual Representation Learning
Hongyang Lei, Huazhen Huang, Bokai Yang, Guosheng Cui, Ruxin Wang, Dan Wu, and Ye Li
[paper]

[CVPR 2023] Learning to Exploit Temporal Structure for Biomedical Vision–Language Processing
Shruthi Bannur∗, Stephanie Hyland∗, Qianchu Liu, Fernando Pérez-García, Maximilian Ilse, Daniel C. Castro, Benedikt Boecking, Harshita Sharma, Kenza Bouzid, Anja Thieme, Anton Schwaighofer, Maria Wetscherek, Matthew P. Lungren, Aditya Nori, Javier Alvarez-Valle, Ozan Oktay
[paper] [code]

[MICCAI 2023] CXR-CLIP: Toward Large Scale Chest X-ray Language-Image Pre-training
Kihyun You, Jawook Gu, Jiyeon Ham, Beomhee Park, Jiho Kim, Eun K. Hong, Woonhyuk Baek, Byungseok Roh
[paper] [code]

[TMI 2023] Improving Medical Vision-Language Contrastive Pretraining with Semantics-aware Triage
Bo Liu, Donghuan Lu, Dong Wei, Xian Wu, Yan Wang, Yu Zhang, Yefeng Zheng
[paper]

[QIMS 2023] SDA-CLIP: surgical visual domain adaptation using video and text labels
Yuchong Li, Shuangfu Jia, Guangbi Song, Ping Wang, Fucang Jia
[paper] [code]

[arXiv 2023] UniBrain: Universal Brain MRI Diagnosis with Hierarchical Knowledge-enhanced Pre-training
Jiayu Lei, Lisong Dai, Haoyun Jiang, Chaoyi Wu, Xiaoman Zhang, Yao Zhang, Jiangchao Yao, Weidi Xie, Yanyong Zhang, Yuehua Li, Ya Zhang, Yanfeng Wang
[paper] [code]

[arXiv 2023] Unified Medical Image-Text-Label Contrastive Learning With Continuous Prompt
Yuhao Wang
[paper]

[arXiv 2023] Significantly Improving Zero-Shot X-ray Pathology Classification via Fine-tuning Pre-trained Image-Text Encoders
Jongseong Jang∗, Daeun Kyung∗, Seung Hwan Kim, Honglak Lee, Kyunghoon Bae, Edward Choi
[paper]

[arXiv 2023] IMITATE: Clinical Prior Guided Hierarchical Vision-Language Pre-training
Che Liu, Sibo Cheng, Miaojing Shi, Anand Shah, Wenjia Bai, Rossella Arcucci
[paper]

[arXiv 2024] AliFuse: Aligning and Fusing Multi-modal Medical Data for Computer-Aided Diagnosis
Qiuhui Chen, Xinyue Hu, Zirui Wang, Yi Hong
[paper]

[arXiv 2024] MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning
Zhe Li, Laurence T. Yang, Bocheng Ren, Xin Nie, Zhangyang Gao, Cheng Tan, Stan Z. Li
[paper]


Knowledge-enhanced

[ACM MM 2022] Align, Reason and Learn: Enhancing Medical Vision-and-Language Pre-training with Knowledge
Zhihong Chen, Guanbin Li, Xiang Wan
[paper] [code]

[ICCV 2023] MedKLIP: Medical Knowledge Enhanced Language-Image Pre-Training for X-ray Diagnosis
Chaoyi Wu, Xiaoman Zhang, Ya Zhang, Yanfeng Wang, Weidi Xie
[paper] [code]

[MICCAI 2023] Knowledge Boosting: Rethinking Medical Contrastive Vision-Language Pre-Training
Xiaofei Chen, Yuting He, Cheng Xue, Rongjun Ge, Shuo Li, Guanyu Yang
[paper] [code]

[Nature Communications 2023] Knowledge-enhanced visual-language pre-training on chest radiology images
Xiaoman Zhang, Chaoyi Wu, Ya Zhang, Weidi Xie & Yanfeng Wang
[paper] [code]

[npj Digital Medicine 2023] A medical multimodal large language model for future pandemics
Fenglin Liu, Tingting Zhu, Xian Wu, Bang Yang, Chenyu You, Chenyang Wang, Yefeng Zheng, Xu Sun, Yang Yang, Lei Clifton, David A. Clifton
[paper]

[arXiv 2023] Towards Medical Artificial General Intelligence via Knowledge-Enhanced Multimodal Pretraining
Bingqian Lin, Zicong Chen, Mingjie Li, Haokun Lin, Hang Xu, Yi Zhu, Jianzhuang Liu, Wenjia Cai, Lei Yang, Shen Zhao, Chenfei Wu, Ling Chen, Xiaojun Chang, Yi Yang, Lei Xing, Xiaodan Liang
[paper] [code]

[arXiv 2023] A Foundation LAnguage-Image model of the Retina (FLAIR): Encoding expert knowledge in text supervision
Julio Silva-Rodriguez, Hadi Chakor, Riadh Kobbi, Jose Dolz, Ismail Ben Ayed
[paper] [code]


Others

[MLHC 2022] Contrastive Learning of Medical Visual Representations from Paired Images and Text
Yuhao Zhang, Hang Jiang, Yasuhide Miura, Christopher D. Manning, Curtis P. Langlotz
[paper] [code]

[NMI 2022] Generalized radiograph representation learning via cross-supervision between images and free-text radiology reports
Hong-Yu Zhou, Xiaoyu Chen, Yinghao Zhang, Ruibang Luo, Liansheng Wang, Yizhou Yu
[paper] [code]

[ICCV 2023] Towards Unifying Medical Vision-and-Language Pre-Training via Soft Prompts
Zhihong Chen, Benyou Wang, Shizhe Diao, Guanbin Li, Xiang Wan
[paper] [code]

[ICCV 2023] Cross-Modal Translation and Alignment for Survival Analysis
Fengtao Zhou, Hao Chen
[paper] [code]

[NeurIPS 2023] Med-UniC: Unifying Cross-Lingual Medical Vision-Language Pre-Training by Diminishing Bias
Zhongwei Wan, Che Liu, Mi Zhang, Jie Fu, Benyou Wang, Sibo Cheng, Lei Ma, César Quilodrán-Casas, Rossella Arcucci
[paper] [code]

[MICCAI 2023] M-FLAG: Medical Vision-Language Pre-training with Frozen Language Models and Latent Space Geometry Optimization
Che Liu, Sibo Cheng, Chen Chen, Mengyun Qiao, Weitong Zhang, Anand Shah, Wenjia Bai, Rossella Arcucci
[paper] [code]

[MICCAI 2023] Pathology-and-genomics Multimodal Transformer for Survival Outcome Prediction
Kexin Ding, Mu Zhou, Dimitris N. Metaxas, Shaoting Zhang
[paper] [code]

[MICCAI 2023] Surgical Video Captioning with Mutual-Modal Concept Alignment
Zhen Chen, Qingyu Guo, Leo K. T. Yeung, Danny T. M. Chan, Zhen Lei, Hongbin Liu & Jinqiao Wang
[paper] [code]

[ICASSP 2024] Freeze the backbones: A Parameter-Efficient Contrastive Approach to Robust Medical Vision-Language Pre-training
Jiuming Qin, Che Liu, Sibo Cheng, Yike Guo, Rossella Arcucci
[paper]

[arXiv 2023] Utilizing Synthetic Data for Medical Vision-Language Pre-training: Bypassing the Need for Real Images
Che Liu, Anand Shah, Wenjia Bai, Rossella Arcucci
[paper]

[arXiv 2024] Benchmarking PathCLIP for Pathology Image Analysis
Sunyi Zheng, Xiaonan Cui, Yuxuan Sun, Jingxiong Li, Honglin Li, Yunlong Zhang, Pingyi Chen, Xueping Jing, Zhaoxiang Ye, Lin Yang
[paper]


CLIP-driven Application

Classification

[MICCAI 2023] CLIP-Lung: Textual Knowledge-Guided Lung Nodule Malignancy Prediction
Yiming Lei, Zilong Li, Yan Shen, Junping Zhang, Hongming Shan
[paper] [code]

[ACL 2022] Language over Labels: Contrastive Language Supervision Exceeds Purely Label-Supervised Classification Performance on Chest X-Rays
Anton Wiehe, Florian Schneider, Sebastian Blank, Xintong Wang, Hans-Peter Zorn, Christian Biemann
[paper] [code]

[ICCE-Asia 2022] Transfer Learning for Medical Image Classification on Multiple Datasets using PubMedCLIP
Hong N. Dao, Tuyen Nguyen Quang, Incheon Paik
[paper]

[Nature BME 2022] Expert-level detection of pathologies from unannotated chest X-ray images via self-supervised learning
Ekin Tiu, Ellie Talius, Pujan Patel, Curtis P. Langlotz, Andrew Y. Ng & Pranav Rajpurkar
[paper] [code]

[ISBI 2023] Self-Supervised Learning with Radiology Reports, A Comparative Analysis of Strategies for Large Vessel Occlusion and Brain CTA Images
S Pachade, S Datta, Y Dong, S Salazar-Marioni, R Abdelkhaleq, A Niktabe, K Roberts, SA Sheth, L Giancardo
[paper]

[ISBI 2023] Joint representation learning from French radiological reports and ultrasound images
Hind Dadoun, Hervé Delingette, Anne-Laure Rousseau, Eric de Kerviler, Nicholas Ayache
[paper]

[ISBI 2023] Multimodal Representation Learning for Blastocyst Assessment
Youcheng Wang, Zhe Zheng, Na Ni, Guoqing Tong, Nuo Cheng, Kai Li, Ping Yin, Yuanyuan Chen, Yingna Wu, Guangping Xie
[paper]

[CEUR Workshop 2023] Multi-stage Medical Image Captioning using Classification and CLIP
Masaki Aono, Hiroki Shinoda, Tetsuya Asakawa, Kazuki Shimizu, Takuya Togawa, Takuyuki Komoda
[paper]

[MIDL 2023] Improving Zero-Shot Detection of Low Prevalence Chest Pathologies using Domain Pre-trained Language Models
Yuhao Zhang, Hang Jiang, Yasuhide Miura, Christopher D. Manning, Curtis P. Langlotz
[paper] [code]

[MIDL 2023] MEDIMP: 3D Medical Images with clinical Prompts from limited tabular data for renal transplantation
Leo Milecki, Vicky Kalogeiton, Sylvain Bodard, Dany Anglicheau, Jean-Michel Correas, Marc-Olivier Timsit, Maria Vakalopoulou
[paper] [code]

[MIDL 2023] Radiology Reports Improve Visual Representations Learned from Radiographs
Haoxu Huang, Samyak Rawlekar, Sumit Chopra, Cem M Deniz
[paper] [code]

[ICCV 2023 workshop] CLIPath: Fine-tune CLIP with Visual Feature Fusion for Pathology Image Analysis Towards Minimizing Data Collection Efforts
Zhengfeng Lai, Zhuoheng Li, Luca Cerny Oliveira, Joohi Chauhan, Brittany N. Dugger, Chen-Nee Chuah
[paper]

[MICCAI 2023] Xplainer: From X-Ray Observations to Explainable Zero-Shot Diagnosis
Chantal Pellegrini, Matthias Keicher, Ege Özsoy, Petra Jiraskova, Rickmer Braren, Nassir Navab
[paper] [code]

[MICCAI 2023 workshop] Concept Bottleneck with Visual Concept Filtering for Explainable Medical Image Classification
Injae Kim, Jongha Kim, Joonmyung Choi, Hyunwoo J. Kim
[paper]

[WACV 2024] I-AI: A Controllable & Interpretable AI System for Decoding Radiologists' Intense Focus for Accurate CXR Diagnoses
Trong Thang Pham, Jacob Brecheisen, Anh Nguyen, Hien Nguyen, Ngan Le
[paper] [code]

[ISBI 2024] Towards Concept-based Interpretability of Skin Lesion Diagnosis using Vision-Language Models
Cristiano Patrício, Luis F. Teixeira, Joao C. Neves
[paper] [code]

[arXiv 2022] Towards Reliable Zero Shot Classification in Self-Supervised Models with Conformal Prediction
Bhawesh Kumar, Anil Palepu, Rudraksh Tuwani, Andrew Beam
[paper]

[arXiv 2023] Domain-Controlled Prompt Learning
Qinglong Cao, Zhengqin Xu, Yuantian Chen, Chao Ma, Xiaokang Yang
[paper]

[arXiv 2023] ETP: Learning Transferable ECG Representations via ECG-Text Pre-training
Che Liu, Zhongwei Wan, Sibo Cheng, Mi Zhang, Rossella Arcucci
[paper]

[arXiv 2023] A ChatGPT Aided Explainable Framework for Zero-Shot Medical Image Diagnosis
Jiaxiang Liu, Tianxiang Hu, Yan Zhang, Xiaotang Gai, Yang Feng, Zuozhu Liu
[paper]

[arXiv 2023] Are Natural Domain Foundation Models Useful for Medical Image Classification?
Joana Palés Huix, Adithya Raju Ganeshan, Johan Fredin Haslum, Magnus Söderberg, Christos Matsoukas, Kevin Smith
[paper] [code]

[arXiv 2023] Exploring Low-Resource Medical Image Classification with Weakly Supervised Prompt Learning
Fudan Zheng, Jindong Cao, Weijiang Yu, Zhiguang Chen, Nong Xiao, Yutong Lu
[paper]

[arXiv 2023] Exploring the Transfer Learning Capabilities of CLIP in Domain Generalization for Diabetic Retinopathy
Sanoojan Baliah, Fadillah A. Maani, Santosh Sanjeev, Muhammad Haris Khan
[paper] [code]

[arXiv 2023] Exploring the Versatility of Zero-Shot CLIP for Interstitial Lung Disease Classification (ICLR under review)
Cara Van Uden, Christian Bluethgen, Maayane Attias, Malgorzata Polacin, Haiwei Henry Guo, Neha Simha, Rishi Raj, Curtis Langlotz
[paper]

[arXiv 2023] Few-shot medical image classification with simple shape and texture text descriptors using vision-language models
Michal Byra, Muhammad Febrian Rachmadi, Henrik Skibbe
[paper] [code]

[arXiv 2023] Fostering transparent medical image AI via an image-text foundation model grounded in medical literature
Chanwoo Kim, Soham U. Gadgil, Alex J. DeGrave, Zhuo Ran Cai, Roxana Daneshjou, Su-In Lee
[paper] [code]

[arXiv 2023] Increasing Textual Context Size Boosts Medical Image-Text Matching
Idan Glassberg, Tom Hope
[paper] [code]

[arXiv 2023] Robust and Interpretable Medical Image Classifiers via Concept Bottleneck Models
An Yan, Yu Wang, Petros Karypis, Zexue He, Chengyu Dong, Zihan Wang, Yiwu Zhong, Jingbo Shang, Amilcare Gentili, Chun-Nan Hsu, Julian McAuley
[paper] [code]


Dense Prediction

[MICCAI 2022] Radiological Reports Improve Pre-training for Localized Imaging Tasks on Chest X-Rays
Philip Müller, Georgios Kaissis, Congyu Zou, Daniel Rueckert
[paper]

[ASMUS 2023] Synthetic Boost: Leveraging Synthetic Data for Enhanced Vision-Language Segmentation in Echocardiography
Rabin Adhikari, Manish Dhakal, Safal Thapaliya, Kanchan Poudel, Prasiddha Bhandari & Bishesh Khanal
[paper] [code]

[ICCV 2023] CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection
Jie Liu, Yixiao Zhang, Jie-Neng Chen, Junfei Xiao, Yongyi Lu, Bennett A Landman, Yixuan Yuan, Alan Yuille, Yucheng Tang, Zongwei Zhou
[paper] [code]

[MICCAI 2023] Multiple Prompt Fusion for Zero-Shot Lesion Detection Using Vision-Language Models
Miaotian Guo, Huahui Yi, Ziyuan Qin, Haiying Wang, Aidong Men, Qicheng Lao
[paper]

[MICCAI 2023] Zero-shot Nuclei Detection via Visual-Language Pre-trained Models
Yongjian Wu, Yang Zhou, Jiya Saiyin, Bingzheng Wei, Maode Lai, Jianzhong Shou, Yubo Fan, Yan Xu
[paper] [code]

[MICCAI 2023] TCEIP: Text Condition Embedded Regression Network for Dental Implant Position Prediction
Xinquan Yang, Jinheng Xie, Xuguang Li, Xuechen Li, Xin Li, Linlin Shen, Yongqiang Deng
[paper]

[MICCAI 2023] Continual Learning for Abdominal Multi-Organ and Tumor Segmentation
Yixiao Zhang, Xinyi Li, Huimiao Chen, Alan L. Yuille, Yaoyao Liu, Zongwei Zhou
[paper] [code]

[MICCAI 2023] TPRO: Text-prompting-based Weakly Supervised Histopathology Tissue Segmentation
Shaoteng Zhang, Jianpeng Zhang, Yutong Xie, Yong Xia
[paper] [code]

[NeurIPS 2023] Text Promptable Surgical Instrument Segmentation with Vision-Language Models
Zijian Zhou, Oluwatosin Alabi, Meng Wei, Tom Vercauteren, Miaojing Shi
[paper] [code]

[arXiv 2023] Exploring Transfer Learning in Medical Image Segmentation using Vision-Language Models
Kanchan Poudel, Manish Dhakal, Prasiddha Bhandari, Rabin Adhikari, Safal Thapaliya, Bishesh Khanal
[paper] [code]

[arXiv 2023] One-shot Localization and Segmentation of Medical Images with Foundation Models
Deepa Anand, Gurunath Reddy M, Vanika Singhal, Dattesh D. Shanbhag, Shriram KS, Uday Patil, Chitresh Bhushan, Kavitha Manickam, Dawei Gui, Rakesh Mullick, Avinash Gopal, Parminder Bhatia, Taha Kass-Hout
[paper]

[ICLR 2024] AnomalyCLIP: Object-agnostic Prompt Learning for Zero-shot Anomaly Detection
Qihang Zhou, Guansong Pang, Yu Tian, Shibo He, Jiming Chen
[paper] [code]


Cross-modal

[PMLH 2021] Retrieval-Based Chest X-Ray Report Generation Using a Pre-trained Contrastive Language-Image Model
Mark Endo, Rayan Krishnan, Viswesh Krishna, Andrew Y. Ng, Pranav Rajpurkar
[paper] [code]

[IPMI 2023] X-TRA: Improving Chest X-ray Tasks with Cross-Modal Retrieval Augmentation
Tom van Sonsbeek, Marcel Worring
[paper]

[ACL 2023] PubMedCLIP: How Much Does CLIP Benefit Visual Question Answering in the Medical Domain?
Sedigheh Eslami, Gerard de Melo, Christoph Meinel
[paper] [code]

[MIDL 2023] FlexR: Few-shot Classification with Language Embeddings for Structured Reporting of Chest X-rays
Matthias Keicher, Kamilia Zaripova, Tobias Czempiel, Kristina Mach, Ashkan Khakzar, Nassir Navab
[paper]

[MICCAI 2023] Open-Ended Medical Visual Question Answering Through Prefix Tuning of Language Models
Tom van Sonsbeek, Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, and Marcel Worring
[paper] [code]

[MICCAI 2023] A Medical Semantic-Assisted Transformer for Radiographic Report Generation
Zhanyu Wang, Mingkang Tang, Lei Wang, Xiu Li, Luping Zhou
[paper]

[TETCI 2023] Parameter-Efficient Transfer Learning for Medical Visual Question Answering
Jiaxiang Liu, Tianxiang Hu, Yan Zhang, Yang Feng, Jin Hao, Junhui Lv, and Zuozhu Liu
[paper]

[AAAI 2024] CLIPSyntel: CLIP and LLM Synergy for Multimodal Question Summarization in Healthcare
Akash Ghosh*, Arkadeep Acharya*, Raghav Jain, Sriparna Saha, Aman Chadha, Setu Sinha
[paper] [code]

[arXiv 2023] PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering
Xiaoman Zhang, Chaoyi Wu, Ziheng Zhao, Weixiong Lin, Ya Zhang, Yanfeng Wang, Weidi Xie
[paper] [code]