Creative Commons Zero v1.0 Universal (CC0-1.0)

Awesome Deep Learning and EHRs

Curated list of awesome papers for electronic health records (EHR) mining, machine learning, and deep learning.

Background

Over the past decade, the volume of EHR data has exploded, and it will continue to grow. Thanks to advances in machine learning and deep learning techniques, electronic health records have been recognized as a powerful resource for tackling clinical challenges. We have collected must-read papers on various EHR topics: recent research trends, applications that predict patient outcomes, real-world deployment, and visualization of complex data.

Survey

  • [pdf] - Machine Learning Applications for Therapeutic Tasks with Genomics Data, K. Huang et al. 2021.

  • [pdf] - Patient similarity: methods and applications, L. Dai et al. 2020.

  • [pdf] - DeepHealth: Deep Learning for Health Informatics reviews, challenges, and opportunities on medical imaging, electronic health records, genomics, sensing, and online communication health, G. H. Kwak et al. 2019.

  • [pdf] - Reinforcement Learning in Healthcare: A Survey, C. Yu et al. 2019.

  • [ref] - A guide to deep learning in healthcare, A. Esteva et al. 2019.

  • [pdf] - Deep EHR: A survey of Recent Advances on Deep Learning Techniques for Electronic Health Record (EHR) Analysis, B. Shickel et al. 2018.

  • [pdf] - Opportunities in Machine Learning for Healthcare, M. Ghassemi et al. 2018.

  • [pdf] - Big Data and Machine Learning in Health Care, A. L. Beam et al. 2018.

  • [pdf] - Big data from electronic health records for early and late translational cardiovascular research: challenges and potential, H. Hemingway et al. 2017.

  • [pdf] - Mining Electronic Health Records: A Survey, P. Yadav et al. 2017.

Data mining & data quality

  • [pdf] - A multi-perspective combined recall and rank framework for Chinese procedure terminology normalization, M. Liang et al. 2021.

  • [pdf] - Examining the impact of data quality and completeness of electronic health records on predictions of patients' risks of cardiovascular disease, Y. Li et al. 2019.

  • [pdf] - MIMIC-Extract: A Data Extraction, Preprocessing, and Representation Pipeline for MIMIC-III, S. Wang et al. 2019.

  • [pdf] - Development and validation of computable Phenotype to Identify and Characterize Kidney Health in Adult Hospitalized Patients, T. Ozrazgat-Baslanti et al. 2019.

  • [ref] - Disease Heritability Inferred from Familial Relationships Reported in Medical Records, F. Polubriaginof et al. 2017.

  • [pdf] - Exploiting a Novel Algorithm and GPUs to Break the One Hundred Million Barrier for Time Series Motifs and Joins, Y. Zhu et al. 2016.

  • [pdf] - Predicting inpatient clinical order patterns with probabilistic topic models vs conventional order sets, J. Chen et al. 2016.

  • [pdf] - Modeling temporal relationships in large scale clinical associations, D. Hanauer et al. 2012.

  • [ref] - Mining electronic health records: towards better research applications and clinical care, P. B. Jensen et al. 2012.

Statistics

  • [pdf] - Improvement in Cardiovascular Risk Prediction with Electronic Health Records, M. M. Pike et al. 2016.

Machine learning

  • [pdf] - Risk Markers by Sex and Age Group for In-Hospital Mortality in Patients with STEMI or NSTEMI: an Approach based on Machine Learning, B. Vázquez et al. 2021.

  • [pdf] - High-throughput Phenotyping with Temporal Sequences, H. Estiri et al. 2019.

  • [pdf] - The Medical Deconfounder: Assessing Treatment Effect with Electronic Health Records (EHRs), L. Zhang et al. 2019.

  • [pdf] - Interpretation of machine learning predictions for patient outcomes in electronic health records, W. L. Cava et al. 2019.

  • [pdf] - A machine learning model to predict the risk of 30-day readmissions in patients with heart failure: a retrospective analysis of electronic medical records data, S. B. Golas et al. 2018.

  • [pdf] - Evaluating electronic health record data sources and algorithmic approaches to identify hypertensive individuals, P. L. Teixeira et al. 2017.

Deep learning

Embedding & Representation

  • [pdf] - Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data, T. Wanyan et al. 2021.

  • [pdf] - Handling Non-ignorably Missing Features in Electronic Health Records Data Using Importance-Weighted Autoencoders, D. K. Lim et al. 2021.

  • [pdf] - Heterogeneous Similarity Graph Neural Network on Electronic Health Records, Z. Liu et al. 2021.

  • [pdf] - Predicting Patient Outcomes with Graph Representation Learning, E. Rocheteau et al. 2021.

  • [pdf] - EVA: Generating Longitudinal Electronic Health Records Using Conditional Variational Autoencoders, S. Biswal et al. 2021.

  • [pdf] - Doctor2Vec: Dynamic Doctor Representation Learning for Clinical Trial Recruitment, S. Biswal et al. 2019.

  • [pdf] - Modelling EHR timeseries by restricting feature interaction, K. Zhang et al. 2019.

  • [pdf] - BEHRT: Transformer for Electronic Health Records, Y. Li et al. 2019.

  • [pdf] - TAPER: Time-Aware Patient EHR Representation, S. Darabi et al. 2019.

  • [pdf] - Modeling Irregularly Sampled Clinical Time Series, S. N. Shukla et al. 2019.

  • [pdf] - TIFTI: A Framework for Extracting Drug Intervals from Longitudinal Clinic Notes, M. Agrawal et al. 2019.

  • [pdf] - Graph Convolutional Transformer: Learning the Graphical Structure of Electronic Health Records, E. Choi et al. 2019.

  • [pdf] - Identification of Predictive Sub-Phenotypes of Acute Kidney Injury using Structured and Unstructured Electronic Health Record Data with Memory Networks, Z. Xu et al. 2019.

  • [pdf] - Learning Hierarchical Representations of Electronic Health Records for Clinical Outcome Prediction, L. Liu et al. 2019.

  • [pdf] - Measuring Patient Similarities via a Deep Architecture with Medical Concept Embedding, L. Gligic et al. 2019.

  • [pdf] - Application of Clinical Concept Embeddings for Heart Failure Prediction in UK EHR data, M. Agrawal et al. 2019.

  • [pdf] - Embedding Electronic Health Records for Clinical Information Retrieval, X. Wei et al. 2019.

  • [pdf] - Patient2Vec: A Personalized Interpretable Deep Representation of the Longitudinal Electronic Health Record, S. Denaxas et al. 2018.

  • [pdf] - MiME: Multilevel Medical Embedding of Electronic Health Records for Predictive Healthcare, E. Choi et al. 2018.

  • [pdf] - Deep Representation for Patient Visits from Electronic Health Records, J. Escudie et al. 2018.

  • [pdf] - Learning Patient Representations from Text, D. Dligach et al. 2018.

  • [pdf] - Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records, R. Miotto et al. 2016.

NLP

  • [pdf] - A frame semantic overview of NLP-based information extraction for cancer-related EHR notes, S. Datta et al. 2019.

  • [pdf] - Named Entity Recognition for Electronic Health Records: A Comparison of Rule-based and Machine Learning Approaches, W. L. Cava et al. 2019.

  • [pdf] - Named Entity Recognition in Electronic Health Records Using Transfer Learning Bootstrapped Neural Networks, Z. Zhu et al. 2019.

  • [pdf] - Unsupervised Pseudo-Labeling for Extractive Summarization on Electronic Health Records, X. Liu et al. 2019.

  • [pdf] - Probabilistic Prognostic Estimates of Survival in Metastatic Cancer Patients (PPES-Met) Utilizing Free-Text Clinical Narratives, I. Banerjee et al. 2018.

  • [pdf] - Biomedical Question Answering via Weighted Neural Network Passage Retrieval, F. Galkó et al. 2018.

  • [pdf] - Natural Language Generation for Electronic Health Records, S. Lee. 2018.

  • [pdf] - Natural Language Processing for EHR-Based Computational Phenotyping, Z. Zeng et al. 2018.

  • [pdf] - Using Clinical Narratives and Structured Data to Identify Distant Recurrences in Breast Cancer, Z. Zeng et al. 2018.

Privacy

  • [pdf] - Federated and Differentially Private Learning for Electronic Health Records, S. R. Pfohl et al. 2019.

  • [pdf] - A Fully Private Pipeline for Deep Learning on Electronic Health Records, E. Chou et al. 2019.

Prediction

  • [pdf] - Neural Clinical Event Sequence Prediction through Personalized Online Adaptive Learning, J. M. Lee et al. 2021.

  • [pdf] - MUFASA: Multimodal Fusion Architecture Search for Electronic Health Records, Z. Xu et al. 2021.

  • [pdf] - Multi-Time Attention Networks for Irregularly Sampled Time Series, S. N. Shukla et al. 2021.

  • [pdf] - DICE: Significance Clustering for Outcome-Aware Stratification, Y. Huang et al. 2021.

  • [pdf] - Modeling Multivariate Clinical Event Time-series with Recurrent Temporal Mechanisms, J. M. Lee et al. 2021.

  • [pdf] - Multi-scale Temporal Memory for Clinical Event Time-Series Prediction, J. M. Lee et al. 2020.

  • [pdf] - Interpolation-Prediction Networks for Irregularly Sampled Time Series, S. N. Shukla et al. 2019.

  • [pdf] - Improved Patient Classification with Language Model Pretraining Over Clinical Notes, J. Kemp et al. 2019.

  • [pdf] - An attention based deep learning model of clinical events in the intensive care unit, D. A. Kaji et al. 2019.

  • [pdf] - Early detection of sepsis utilizing deep learning on electronic health record event sequences, S. M. Lauritsen et al. 2019.

  • [pdf] - MetaPred: Meta-Learning for Clinical Risk Prediction with Limited Patient Electronic Health Records, X. X. Zhang et al. 2019.

  • [pdf] - Predicting Diabetes Disease Evolution Using Financial Records and Recurrent Neural Networks, R. T. Sousa et al. 2019.

  • [pdf] - Recent context-based LSTM for Clinical Event Time-series Prediction, J. Lee et al. 2018.

  • [pdf] - Deep Diabetologist: Learning to Prescribe Hyperglycemia Medications with Hierarchical Recurrent Neural Networks, J. Mei et al. 2018.

  • [pdf] - Expert System for Diagnosis of Chest Diseases Using Neural Networks, I. Kayali et al. 2018.

  • [pdf] - Scalable and accurate deep learning with electronic health records, A. Rajkomar et al. 2018.

  • [pdf] - HeteroMed: Heterogeneous Information Network for Medical Diagnosis, A. Hosseini et al. 2018.

  • [pdf] - Generating Multi-label Discrete Patient Records using Generative Adversarial Networks, E. Choi et al. 2018.

  • [pdf] - Countdown Regression: Sharp and Calibrated Survival Predictions, A. Avati et al. 2018.

  • [pdf] - Mixed Effect Composite RNN-GP: A Personalized and Reliable Prediction Model for Healthcare, I. Chung et al. 2018.

  • [pdf] - Uncertainty-Aware Attention for Reliable Interpretation and Prediction, J. Heo et al. 2018.

  • [pdf] - Supervised Reinforcement Learning with Recurrent Neural Network for Dynamic Treatment Recommendation, L. Wang et al. 2018.

  • [pdf] - A Deep Learning Interpretable Classifier for Diabetic Retinopathy Disease Grading, J. Torre et al. 2017.

  • [pdf] - Towards the Augmented Pathologist: Challenges of Explainable-AI in Digital Pathology, A. Holzinger et al. 2017.

  • [pdf] - Modeling Missing Data in Clinical Time Series with RNNs, Z. C. Lipton et al. 2016.

Explainability

  • [pdf] - Risk factor identification for incident heart failure using neural network distillation and variable selection, Y. Li et al. 2021.

  • [pdf] - An explainable Transformer-based deep learning model for the prediction of incident heart failure, S. Rao et al. 2021.

  • [pdf] - Inference for the Case Probability in High-dimensional Logistic Regression, Z. Guo et al. 2021.

  • [pdf] - Concept-based Model Explanations for Electronic Health Records, S. Baur et al. 2020.

Visualization

  • [pdf] - Modeling and Leveraging Analytic Focus During Exploratory Visual Analysis, Z. Zhou et al. 2021.

  • [pdf] - Analyzing Time Attributes in Temporal Event Sequences, J. Magallanes et al. 2019.

  • [pdf] - Selection Bias Tracking and Detailed Subset Comparison for High-Dimensional Data, D. Borland et al. 2019.

  • [pdf] - DPVis: Visual Exploration of Disease Progression Pathways, B. C. Kwon et al. 2019.

  • [pdf] - MAQUI: Interweaving Queries and Pattern Mining for Recursive Event Sequence Exploration, P. Law et al. 2019.

  • [pdf] - EventAction: A Visual Analytics Approach to Explainable Recommendation for Event Sequences, F. Du et al. 2018.

  • [pdf] - ClinicalVis: Supporting Clinical Task-Focused Design Evaluation, M. Ghassemi et al. 2018.

  • [pdf] - CarePre: An Intelligent Clinical Decision Assistance System, Z. Jin et al. 2018.

  • [pdf] - Visualizing Patient Timelines in the Intensive Care Unit, D. L. Lambert et al. 2018.

  • [pdf] - CoreFlow: Extracting and Visualizing Branching Patterns from Event Sequences, Z. Liu et al. 2017.

  • [pdf] - PhenoStacks: Cross-Sectional Cohort Phenotype Comparison Visualizations, M. Glueck et al. 2016.

  • [pdf] - Using Visual Analytics to Interpret Predictive Machine Learning Models, J. Krause et al. 2016.

  • [pdf] - Iterative cohort analysis and exploration, Z. Zhang et al. 2014.

  • [pdf] - An Evaluation of Visual Analytics Approaches to Comparing Cohorts of Event Sequences, S. Malik et al. 2014.

Standardization

  • [pdf] - Application of HL7 FHIR in a Microservice Architecture for Patient Navigation on Registration and Appointments, G. N. Bettoni et al. 2021.

  • [pdf] - A semi-autonomous approach to connecting proprietary EHR standards to FHIR, M. Chapman et al. 2019.

  • [pdf] - CREATE: Cohort Retrieval Enhanced by Analysis of Text from Electronic Health Records using OMOP Common Data Model, S. Liu et al. 2019.

Deployment

  • [pdf] - Trust Issues: Uncertainty Estimation Does Not Enable Reliable OOD Detection On Medical Tabular Data, D. Ulmer et al. 2020.

  • [pdf] - Machine Learning in Precision Medicine to Preserve Privacy via Encryption, W. Briguglio et al. 2021.

  • [pdf] - Adversarial Sample Enhanced Domain Adaptation: A Case Study on Predictive Modeling with Electronic Health Records, Y. Yu et al. 2021.

Clinical Trial Recruitment

  • [pdf] - COMPOSE: Cross-Modal Pseudo-Siamese Network for Patient Trial Matching, J. Gao et al. 2020.

  • [pdf] - DeepEnroll: Patient-Trial Matching with Deep Embedding and Entailment Prediction, X. Zhang et al. 2020.


Acknowledgement

Thank you for all your contributions. Please make sure to read the contributing guide before you make a pull request.