
Machine Learning in Healthcare and Biomedical Applications

  • Data-driven decision making
  • Questions -> Data -> Models/Tools

Table of Contents

  1. Overview
  2. EHR data
  3. Insurance claims data
  4. Clinical notes
  5. Image data
  6. Time series data
  7. Genomics data

Overview

| Data type | Models/Tools | Applications |
|---|---|---|
| EHR data; insurance claims data | ML (logistic regression, XGBoost) | Predict outcomes (disease, death, readmission, etc.) |
| Clinical notes; conversation text data | Rule-based approach (regular expressions); deep learning approach | Extract concepts from clinical notes; knowledge graphs; chat-bot; QA system |
| Medical image data (X-ray, CT, OCT image, etc.) | CNN | Detection: diagnosis of skin cancer, lung nodule, or diabetic retinopathy; segmentation of tumor, histopathology |
| Time series data (EEG, ECG, vital sign data, etc.) | HMM, RNN, CNN | Heart disease; sleep disorder (apnea); ICU monitoring |
| Genomics data | GATK, QIIME | Cancer mutation identification; biomarker identification; drug discovery |
| Other data (hospital operational data) | ML (regression); queueing model | Reduce operational cost; improve patient experience; ER wait time and queueing |
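
As a rough illustration of the queueing-model entry for hospital operational data, the sketch below computes steady-state metrics for an M/M/1 queue (Poisson arrivals, exponential service times, one server). Treating an ER as a single-server queue is a strong simplifying assumption made here for illustration only, and the function name `mm1_metrics` and the example rates are hypothetical.

```python
def mm1_metrics(arrival_rate: float, service_rate: float) -> dict:
    """Steady-state metrics for an M/M/1 queue.

    arrival_rate: mean arrivals per unit time (lambda)
    service_rate: mean patients served per unit time (mu)
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        # The queue grows without bound when lambda >= mu.
        raise ValueError("unstable queue: arrival_rate must be below service_rate")

    utilization = arrival_rate / service_rate              # rho = lambda / mu
    mean_wait_in_queue = utilization / (service_rate - arrival_rate)  # Wq
    mean_time_in_system = 1.0 / (service_rate - arrival_rate)         # W
    mean_queue_length = arrival_rate * mean_wait_in_queue  # Lq, by Little's law
    return {
        "utilization": utilization,
        "mean_wait_in_queue": mean_wait_in_queue,
        "mean_time_in_system": mean_time_in_system,
        "mean_queue_length": mean_queue_length,
    }


# Hypothetical rates: 4 arrivals per hour, 5 patients served per hour.
metrics = mm1_metrics(arrival_rate=4.0, service_rate=5.0)
```

With these example rates the server is busy 80% of the time and the mean wait before service is 0.8 hours (48 minutes). A real department has several servers, triage priorities, and time-varying arrivals, so this is only a starting point.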

EHR data

| Prediction outcomes | Models/Tools | Data type | Sample size | Reference | Year |
|---|---|---|---|---|---|
| Review | | | | Mining electronic health records: towards better research applications and clinical care | 2012 |
| Review | | | | Opportunities and challenges in developing risk prediction models with electronic health records data: a systematic review | 2016 |
| Heart failure | Logistic regression; random forest | Longitudinal EHR data | 1684 heart failure cases and 13525 matched controls | Early Detection of Heart Failure Using Electronic Health Records | 2016 |
| Heart failure (review) | | | | Population Risk Prediction Models for Incident Heart Failure | 2015 |
| Kidney transplant graft failure | Cox regression | 10-year EHR data | 69,440 kidney transplants | A comprehensive risk quantification score for deceased donor kidneys: the kidney donor risk index | 2009 |
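
The logistic-regression entry above can be illustrated with a minimal sketch of risk prediction on tabular data. It is not the method of any cited paper: the two standardized features (age and number of prior admissions), the synthetic cohort, and the learning rate and epoch count are all assumptions made for illustration. It uses only the Python standard library.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x) -> float:
    """Predicted probability of the outcome for one feature vector."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Synthetic cohort (assumption): outcome risk rises with both features.
random.seed(0)
X, y = [], []
for _ in range(400):
    age_z = random.gauss(0, 1)               # standardized age
    prior_admissions_z = random.gauss(0, 1)  # standardized prior admissions
    true_p = sigmoid(1.5 * age_z + 1.0 * prior_admissions_z - 0.5)
    X.append([age_z, prior_admissions_z])
    y.append(1 if random.random() < true_p else 0)

w, b = fit_logistic(X, y)
high_risk = predict_proba(w, b, [2.0, 2.0])
low_risk = predict_proba(w, b, [-2.0, -2.0])
```

In practice a library implementation with regularization, a held-out test set, and calibration checks would be the more usual choice. The hand-written loop is here only to keep the example self-contained.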

Clinical notes

| Prediction outcomes | Models/Tools | Data type | Sample size | Reference | Year |
|---|---|---|---|---|---|
| Review | | | | Realizing the full potential of electronic health records: the role of natural language processing | 2011 |
| Review | | | | Natural language processing: an introduction | 2011 |
| Negation | Regular expression and rule-based approach | Clinical reports | 2060 discharge summaries | A simple algorithm for identifying negated findings and diseases in discharge summaries | 2001 |
| | | | | Using electronic health records to drive discovery in disease genomics | |
| NER | | Discharge summaries | 826 notes | A study of machine-learning-based approaches to extract clinical entities and their assertions from discharge summaries | 2011 |
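
The regular-expression, rule-based approach to negation listed above can be sketched as follows. This is a simplified illustration and not the algorithm from the cited paper: the trigger list is a short hypothetical sample, the `window` parameter is an assumption, and only negation triggers that precede the concept are handled.

```python
import re

# Hypothetical, abbreviated trigger list; real systems use larger curated lists.
NEGATION_TRIGGERS = ["no", "denies", "without", "negative for", "no evidence of"]

# Longest triggers first so "no evidence of" is preferred over "no".
_TRIGGER_RE = re.compile(
    r"\b("
    + "|".join(
        re.escape(t) for t in sorted(NEGATION_TRIGGERS, key=len, reverse=True)
    )
    + r")\b",
    re.IGNORECASE,
)


def is_negated(sentence: str, concept: str, window: int = 5) -> bool:
    """Return True if a negation trigger precedes the first mention of
    `concept` with at most `window` whitespace-separated tokens in between."""
    lowered = sentence.lower()
    concept_pos = lowered.find(concept.lower())
    if concept_pos == -1:
        return False  # concept not mentioned at all
    for match in _TRIGGER_RE.finditer(lowered):
        if match.end() <= concept_pos:
            tokens_between = lowered[match.end():concept_pos].split()
            if len(tokens_between) <= window:
                return True
    return False


examples = [
    ("The patient denies chest pain.", "chest pain"),
    ("Patient reports chest pain.", "chest pain"),
    ("No evidence of pneumonia.", "pneumonia"),
]
results = [is_negated(sentence, concept) for sentence, concept in examples]
```

The sketch ignores sentence boundaries inside the input, triggers that follow the concept, and pseudo-negations such as "no increase", all of which a production rule set would need to handle.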

Image data

| Prediction outcomes | Models/Tools | Data type | Sample size | Reference | Year |
|---|---|---|---|---|---|
| Diabetic retinopathy | CNN | Retinal fundus images | 128175 retinal images | Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs | 2016 |
| Skin cancer | CNN | Skin images | 129,450 skin images | Dermatologist-level classification of skin cancer with deep neural networks | 2017 |
| Tumor | CNN | Pathology images | 400+110 slides | Detecting Cancer Metastases on Gigapixel Pathology Images | 2017 |
| | | | | Deep Learning Automates the Quantitative Analysis of Individual Cells in Live-Cell Imaging Experiments | |

Time series data

| Prediction outcomes | Models/Tools | Data type | Sample size | Reference | Year |
|---|---|---|---|---|---|
| Sinus rhythm and atrial fibrillation | 34-layer convolutional neural network (CNN) | Single-lead ECG | Train: 64,121 ECG records from 29,163 patients; Test: 336 records from 328 unique patients | Cardiologist-Level Arrhythmia Detection with Convolutional Neural Networks | 2017 |
| Hand movements | CNN | sEMG | 67 intact subjects and 11 transradial amputees | Deep Learning with Convolutional Neural Networks Applied to Electromyography Data: A Resource for the Classification of Movements for Prosthetic Hands | 2016 |
| Review | | ICU data | | Machine Learning and Decision Support in Critical Care | 2017 |

Genomics data

| Prediction outcomes | Models/Tools | Data type | Sample size | Reference | Year |
|---|---|---|---|---|---|
| Genetic variants | Exome NGS | NGS & EHR data | 50,726 individuals | Distribution and clinical impact of functional variants in 50,726 whole-exome sequences from the DiscovEHR study | 2016 |
| Familial hypercholesterolemia | Exome NGS | NGS & EHR data | 50,726 individuals | Genetic identification of familial hypercholesterolemia within a single U.S. health care system | 2016 |

Other

| Prediction outcomes | Models/Tools | Data type | Sample size | Reference | Year |
|---|---|---|---|---|---|
| Drug discovery | LSTM | Assay | 12-27 assays | Low data drug discovery with one-shot learning | 2017 |
| Tutorial | | Image | | Deep learning models for health care: challenges and solutions | 2017 |
| Tutorial | | Image | | Deep learning in radiology: recent advances, challenges and future trends | 2016 |
| Tutorial | | | | Big data analytics for healthcare | 2013 |
| Tutorial | | Image | | Survey of deep learning in radiology | 2017 |
| ER wait time | | ER visit time | | Accurate ED Wait Time Prediction | 2017 |