MedicalLLMs

This repository collects and regularly updates existing and upcoming LLM tools and papers that focus on the healthcare and medical domain.

Summary of LLMs in medicine, healthcare, and clinical settings.

This table was last updated on 12 October 2023 and covers GatorTron, BioBERT, PubMedBERT, BioMegatron, ClinicalBERT, and Med-PaLM 1 & 2.

Model	Paper	Code	Complexity	Data	Tasks
GatorTron	Link	NVIDIA Hugging Face	Base: 345M Medium: 3.9B Large: 8.9B	1. n2c2 NLP datasets 2. MedNLI dataset 3. emrQA dataset 4. MIMIC-III dataset 5. PubMed dataset 6. Wikipedia dataset 7. UF clinical notes (closed)	- Concept extraction - Relation extraction - Semantic textual similarity - Natural language inference (NLI) - Question answering
BioBERT	Link	GitHub	BERT (Wiki+Books): 1M BioBERT (+PubMed): 1M BioBERT (+PMC): 270K BioBERT (+PubMed, PMC): 470K	1. English Wikipedia 2. BooksCorpus 3. PubMed abstracts 4. PMC full-text articles	- Named entity recognition (NER) - Relation extraction - Question answering
PubMedBERT	Link	Hugging Face	Built on BERT: 1M	1. PubMed abstracts 2. PubMed full text	- NER - Information extraction - Relation extraction - Semantic similarity - Document classification - Question answering
BioMegatron	Link	Closed	Built on Megatron: 8.3B BioMegatron S: 345M BioMegatron M: 800M BioMegatron L: 1.2B	Megatron: 1. Wikipedia 2. CC-Stories 3. RealNews 4. OpenWebText BioMegatron: 5. PubMed abstracts (4.5B) 6. PMC full text (1.6B)	- NER - Relation extraction - Question answering
ClinicalBERT	Link	GitHub Hugging Face	Built on BERT: 1M Built on BioBERT: 1M	1. All MIMIC clinical notes 2. MIMIC discharge summaries	- NER - Concept extraction - NLI
Med-PaLM 1	Link	Closed	Built on PaLM: 540B	MultiMedQA (medical exam and research datasets): 1. MedQA 2. MedMCQA 3. PubMedQA 4. LiveQA 5. MedicationQA 6. MMLU clinical topics Curated health search queries: 7. HealthSearchQA	- Question answering
Med-PaLM 2	Link	Closed	Built on PaLM 2	1. MedQA 2. MedMCQA 3. PubMedQA 4. MMLU clinical topics 5. HealthSearchQA 6. LiveQA 7. MedicationQA	- Question answering

Evaluation methods / Benchmarks

Platform	Paper	Code	Tasks	Metrics	Datasets
HELM	Link	GitHub	- Question answering - Information retrieval - Summarization - Sentiment analysis - Reasoning ... and 12 other tasks	Accuracy, Calibration, Robustness, Fairness, Bias, Toxicity, Efficiency, General Info, Summarization, Disinformation, Copyright, Classification, APPS Metrics, and BBQ Metrics	Download Page
BLURB	Link	404	- NER - Question answering - Information retrieval - Relation extraction - Semantic similarity - Classification	Accuracy (F1, correlation)	Download Page
GLUE	Link	GitHub	- Question answering - Semantic similarity - NLI - Sentiment analysis - Coreference resolution	Accuracy (F1, correlation)	Download Page
SuperGLUE	Link	GitHub	- Question answering - Reasoning - Classification - Textual entailment - Coreference resolution	Accuracy (F1, correlation)	Download Page
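
The Metrics column above mentions F1 and correlation. As a rough illustration of what those two numbers measure, here is a minimal pure-Python sketch. The function names and the toy labels are hypothetical and are not taken from any of the listed benchmarks; each benchmark ships its own official scoring scripts, which should be used for reported results.

```python
import math
from typing import Sequence


def f1_score(gold: Sequence[int], pred: Sequence[int], positive: int = 1) -> float:
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation, e.g. between predicted and gold similarity scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy data (hypothetical): 1 = relation present, 0 = absent.
gold_labels = [1, 0, 1, 1, 0]
pred_labels = [1, 0, 0, 1, 1]
f1 = f1_score(gold_labels, pred_labels)

# Toy similarity scores (hypothetical) for a semantic similarity task.
gold_scores = [1.0, 2.0, 3.0, 4.0]
pred_scores = [2.0, 4.0, 6.0, 8.0]
r = pearson(gold_scores, pred_scores)
```

F1 is typically reported for extraction and classification tasks such as NER and relation extraction, while correlation is typically reported for semantic similarity tasks, where predictions are continuous scores rather than labels.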

Summary of Knowledge-enhanced LLMs

A comprehensive review of knowledge-enhanced LLMs: A Survey of Knowledge Enhanced Pre-Trained Language Models.
A comparable work on knowledge-enhanced medical LLMs: SMedBERT: A Knowledge-Enhanced Pre-trained Language Model with Structured Semantics for Medical Text Mining.

MaastrichtU-IDS/MedicalLLM
