MedLLMsPracticalGuide: A repository from wenyu332

The Practical Guides for Medical Large Language Models

If you like our project, please give us a star ⭐ on GitHub to follow the latest updates.

This is an actively updated list of practical guide resources for medical large language models (LLMs). It's based on our survey paper: A Survey of Large Language Models in Medicine: Progress, Application, and Challenge.

⚡ Contributing

If you want to add your work or model to this list, please email fenglin.liu@eng.ox.ac.uk and xyzou@uwaterloo.ca or open a pull request. Markdown format:

- [**Name of Conference or Journal + Year**] Paper Name. [[pdf]](link) [[code]](link)
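
As a minimal sketch, an entry in the format above could be generated with a small helper. The function name `format_entry` and the example values are hypothetical and not part of this repository. Treating the code link as optional is also an assumption.

```python
from typing import Optional


def format_entry(venue: str, year: int, title: str,
                 pdf_url: str, code_url: Optional[str] = None) -> str:
    """Build one list entry in the Markdown format shown above.

    Hypothetical helper for illustration; not part of this repository.
    """
    entry = f"- [**{venue} {year}**] {title}. [[pdf]]({pdf_url})"
    if code_url:  # assumption: the code link may be omitted when no code exists
        entry += f" [[code]]({code_url})"
    return entry


if __name__ == "__main__":
    # Placeholder values only; replace them with the real paper details.
    print(format_entry("Venue", 2023, "Paper Name",
                       "https://example.com/paper.pdf",
                       "https://example.com/code"))
```

The printed line can be pasted into the matching section of the list before opening a pull request.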

🤗 Highlights

This repository provides an overview of the progress, applications, and challenges of LLMs in medicine, with the goal of promoting further research and exploration in this interdisciplinary field.

📣 Update Notes

[2023-11-09] We released the repository.

Practical Guide for Building Pipeline
Practical Guide for Medical Data
Downstream Biomedical Tasks
- TODO
Practical Guide for Clinical Applications
Practical Guide for Challenges
Practical Guide for Future Directions
Acknowledgement
Citation

🔥 Practical Guide for Building Pipeline

Pre-training from Scratch

BioBERT: A pre-trained biomedical language representation model for biomedical text mining. 2020. paper
PubMedBERT: Domain-specific language model pretraining for biomedical natural language processing. 2021. paper
SciBERT: A pretrained language model for scientific text. 2019. paper
ClinicalBERT: Publicly available clinical BERT embeddings. 2019. paper
BlueBERT: Transfer learning in biomedical natural language processing: an evaluation of BERT and ELMo on ten benchmarking datasets. 2019. paper
BioCPT: Contrastive pre-trained transformers with large-scale pubmed search logs for zero-shot biomedical information retrieval. 2023. paper
BioGPT: Generative pre-trained transformer for biomedical text generation and mining. 2022. paper
OphGLM: Training an Ophthalmology Large Language-and-Vision Assistant based on Instructions and Dialogue. 2023. paper
GatorTron: A large language model for electronic health records. 2022. paper
GatorTronGPT: A Study of Generative Large Language Model for Medical Research and Healthcare. 2023. paper

Fine-tuning General LLMs

ChatGLM-Med: Fine-tuning the ChatGLM model with Chinese medical knowledge. 2023. github
DoctorGLM: Fine-tuning your chinese doctor is not a herculean task. 2023. paper
BianQue: Balancing the Questioning and Suggestion Ability of Health LLMs with Multi-turn Health Conversations Polished by ChatGPT. 2023. paper
ClinicalGPT: Large Language Models Finetuned with Diverse Medical Data and Comprehensive Evaluation. 2023. paper
Qilin-Med: Multi-stage Knowledge Injection Advanced Medical Large Language Model. 2023. paper
Qilin-Med-VL: Towards Chinese Large Vision-Language Model for General Healthcare. 2023. paper
ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge. 2023. paper
BenTsao: Tuning llama model with chinese medical knowledge. 2023. paper
HuatuoGPT: HuatuoGPT, towards Taming Language Model to Be a Doctor. 2023. paper
LLaVA-Med: Training a large language-and-vision assistant for biomedicine in one day. 2023. paper
Baize-healthcare: An open-source chat model with parameter-efficient tuning on self-chat data. 2023. paper
Visual Med-Alpaca: A parameter-efficient biomedical llm with visual capabilities. 2023. repo
PMC-LLaMA: Further finetuning llama on medical papers. 2023. paper
Clinical Camel: An Open-Source Expert-Level Medical Language Model with Dialogue-Based Knowledge Encoding. 2023. paper
MedPaLM 2: Towards expert-level medical question answering with large language models. 2023. paper
MedPaLM M: Towards generalist biomedical ai. 2023. paper
CPLLM: Clinical Prediction with Large Language Models. 2023. paper

Prompting General LLMs

DeID-GPT: Zero-shot medical text de-identification by gpt-4. 2023. paper
ChatCAD: Interactive computer-aided diagnosis on medical image using large language models. 2023. paper
Dr. Knows: Leveraging a medical knowledge graph into large language models for diagnosis prediction. 2023. paper
MedPaLM: Large language models encode clinical knowledge. 2022. paper

📊 Practical Guide for Medical Data

Clinical Knowledge Bases

Drugs.com
DrugBank
NHS Health
NHS Medicine
Unified Medical Language System (UMLS)
The Human Phenotype Ontology

Pre-training Data

PubMed: National Institutes of Health. PubMed Data. In National Library of Medicine. 2022. database
Literature: Construction of the literature graph in semantic scholar. 2018. paper
MIMIC-III: MIMIC-III, a freely accessible critical care database. 2016. paper
PubMed: The pile: An 800gb dataset of diverse text for language modeling. 2020. paper
MedDialog: Meddialog: Two large-scale medical dialogue datasets. 2020. paper
EHRs: A large language model for electronic health records. 2022. paper
EHRs: A Study of Generative Large Language Model for Medical Research and Healthcare. 2023. paper

Fine-tuning Data

cMeKG: Chinese Medical Knowledge Graph. 2023. github
CMD: Chinese medical dialogue data. 2023. repo
BianQueCorpus: BianQue: Balancing the Questioning and Suggestion Ability of Health LLMs with Multi-turn Health Conversations Polished by ChatGPT. 2023. paper
MD-EHR: ClinicalGPT: Large Language Models Finetuned with Diverse Medical Data and Comprehensive Evaluation. 2023. paper
VariousMedQA: Multi-scale attentive interaction networks for chinese medical question answer selection. 2018. paper
VariousMedQA: What disease does this patient have? a large-scale open domain question answering dataset from medical exams. 2021. paper
MedDialog: Meddialog: Two large-scale medical dialogue datasets. 2020. paper
ChiMed: Qilin-Med: Multi-stage Knowledge Injection Advanced Medical Large Language Model. 2023. paper
ChiMed-VL: Qilin-Med-VL: Towards Chinese Large Vision-Language Model for General Healthcare. 2023. paper
Healthcare Magic: Healthcare Magic. platform
ICliniq: ICliniq. platform
Hybrid SFT: HuatuoGPT, towards Taming Language Model to Be a Doctor. 2023. paper
PMC-15M: Large-scale domain-specific pretraining for biomedical vision-language processing. 2023. paper
MedQuAD: A question-entailment approach to question answering. 2019. paper
VariousMedQA: Visual med-alpaca: A parameter-efficient biomedical llm with visual capabilities. 2023. repo
MTB: Med-flamingo: a multimodal medical few-shot learner. 2023. paper
PMC-OA: Pmc-clip: Contrastive language-image pre-training using biomedical documents. 2023. paper
Medical Meadow: MedAlpaca--An Open-Source Collection of Medical Conversational AI Models and Training Data. 2023. paper
Literature: S2ORC: The semantic scholar open research corpus. 2019. paper
MedC-I: Pmc-llama: Further finetuning llama on medical papers. 2023. paper
ShareGPT: Sharegpt. 2023. platform
PubMed: National Institutes of Health. PubMed Data. In National Library of Medicine. 2022. database
MedQA: What disease does this patient have? a large-scale open domain question answering dataset from medical exams. 2021. paper
MultiMedQA: Towards expert-level medical question answering with large language models. 2023. paper
MultiMedBench: Towards generalist biomedical ai. 2023. paper

✨ Practical Guide for Clinical Applications

Medical Diagnosis

Designing a Deep Learning-Driven Resource-Efficient Diagnostic System for Metastatic Breast Cancer: Reducing Long Delays of Clinical Diagnosis and Improving Patient Survival in Developing Countries. 2023. paper
AI in health and medicine. 2022. paper
Large language models in medicine. 2023. paper
Leveraging a medical knowledge graph into large language models for diagnosis prediction. 2023. paper
ChatCAD: Interactive computer-aided diagnosis on medical image using large language models. 2023. paper

Formatting and ICD-Coding

Applying large language model artificial intelligence for retina International Classification of Diseases (ICD) coding. 2023. paper
PLM-ICD: automatic ICD coding with pretrained language models. 2022. paper

Clinical Report Generation

Using ChatGPT to write patient clinic letters. 2023. paper
ChatGPT: the future of discharge summaries?. 2023. paper
ChatCAD: Interactive computer-aided diagnosis on medical image using large language models. 2023. paper
Can GPT-4V(ision) Serve Medical Applications? Case Studies on GPT-4V for Multimodal Medical Diagnosis. 2023. paper
Qilin-Med-VL: Towards Chinese Large Vision-Language Model for General Healthcare. 2023. paper
Customizing General-Purpose Foundation Models for Medical Report Generation. 2023. paper
Towards generalist foundation model for radiology. 2023. paper
Clinical Text Summarization: Adapting Large Language Models Can Outperform Human Experts. 2023. paper

Medical Education

Large Language Models in Medical Education: Opportunities, Challenges, and Future Directions. 2023. paper
The Advent of Generative Language Models in Medical Education. 2023. paper
The impending impacts of large language models on medical education. 2023. paper

Medical Robotics

A Nested U-Structure for Instrument Segmentation in Robotic Surgery. 2023. paper
The multi-trip autonomous mobile robot scheduling problem with time windows in a stochastic environment at smart hospitals. 2023. paper
Advanced robotics for medical rehabilitation. 2016. paper
GRID: Scene-Graph-based Instruction-driven Robotic Task Planning. 2023. paper
Trust in Construction AI-Powered Collaborative Robots: A Qualitative Empirical Analysis. 2023. paper

Medical Language Translation

Machine translation of standardised medical terminology using natural language processing: A Scoping Review. 2023. paper
The Advent of Generative Language Models in Medical Education. 2023. paper
The impending impacts of large language models on medical education. 2023. paper

Mental Health Support

ChatCounselor: A Large Language Models for Mental Health Support. 2023. paper
Tell me, what are you most afraid of? Exploring the Effects of Agent Representation on Information Disclosure in Human-Chatbot Interaction. 2023. paper
A Brief Wellbeing Training Session Delivered by a Humanoid Social Robot: A Pilot Randomized Controlled Trial. 2023. paper
Real conversations with artificial intelligence: A comparison between human–human online conversations and human–chatbot conversations. 2015. paper

⚔️ Practical Guide for Challenges

Hallucination

Survey of hallucination in natural language generation. 2023. paper
Med-halt: Medical domain hallucination test for large language models. 2023. paper
A survey of hallucination in large foundation models. 2023. paper
Selfcheckgpt: Zero-resource black-box hallucination detection for generative large language models. 2023. paper
Retrieval augmentation reduces hallucination in conversation. 2021. paper
Chain-of-verification reduces hallucination in large language models. 2023. paper

Lack of Evaluation Benchmarks and Metrics

What disease does this patient have? a large-scale open domain question answering dataset from medical exams. 2021. paper
Truthfulqa: Measuring how models mimic human falsehoods. 2021. paper
HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models. 2023. paper

Domain Data Limitations

Textbooks Are All You Need. 2023. paper
Model Dementia: Generated Data Makes Models Forget. 2023. paper

New Knowledge Adaptation

Detecting Edit Failures In Large Language Models: An Improved Specificity Benchmark. 2023. paper
Editing Large Language Models: Problems, Methods, and Opportunities. 2023. paper
Retrieval-augmented generation for knowledge-intensive nlp tasks. 2020. paper

Behavior Alignment

Aligning ai with shared human values. 2020. paper
Training a helpful and harmless assistant with reinforcement learning from human feedback. 2022. paper
Improving alignment of dialogue agents via targeted human judgements. 2022. paper
Webgpt: Browser-assisted question-answering with human feedback. 2021. paper
Languages are rewards: Hindsight finetuning using human feedback. 2023. paper

Ethical, Legal, and Safety Concerns

ChatGPT utility in healthcare education, research, and practice: systematic review on the promising perspectives and valid concerns. 2023. paper
ChatGPT listed as author on research papers: many scientists disapprove. 2023. paper
A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics. 2023. paper
Multi-step jailbreaking privacy attacks on chatgpt. 2023. paper
Jailbroken: How does llm safety training fail?. 2023. paper

🚀 Practical Guide for Future Directions

Introduction of New Benchmarks

A comprehensive benchmark study on biomedical text generation and mining with ChatGPT. 2023. paper
Creation and adoption of large language models in medicine. 2023. paper

Interdisciplinary Collaborations

Creation and adoption of large language models in medicine. 2023. paper
ChatGPT and Physicians' Malpractice Risk. 2023. paper

Multi-modal LLM

A Survey on Multimodal Large Language Models. 2023. paper
Mm-react: Prompting chatgpt for multimodal reasoning and action. 2023. paper
ChatGPT for shaping the future of dentistry: the potential of multi-modal large language model. 2023. paper
Frozen Language Model Helps ECG Zero-Shot Learning. 2023. paper
Exploring and Characterizing Large Language Models For Embedded System Development and Debugging. 2023. paper
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models. 2023. paper

LLMs in less established fields of healthcare

Large Language Models in Sport Science & Medicine: Opportunities, Risks and Considerations. 2023. paper

👍 Acknowledgement

LLMsPracticalGuide: the codebase we built upon, which is also a comprehensive LLM survey.

📑 Citation

Please consider citing 📑 our paper if our repository is helpful to your work. Thank you!

@article{zhou2023survey,
   title={A Survey of Large Language Models in Medicine: Progress, Application, and Challenge},
   author={Hongjian Zhou and Boyang Gu and Xinyu Zou and Yiru Li and Sam S. Chen and Peilin Zhou and Junling Liu and Yining Hua and Chengfeng Mao and Xian Wu and Zheng Li and Fenglin Liu},
   journal={arXiv preprint arXiv:2311.05112},
   year={2023}
}
