MachineLearningLeukemiaDetection

Six machine learning methods for detecting acute myeloid leukemia from genetic and hematological data

Using Machine Learning to Detect Acute Myeloid Leukemia

DS1003 Machine Learning, Spring 2018

Erica Dominic and Joanna Bitton

Abstract

Acute myeloid leukemia (AML) is a blood cancer correlated with certain genetic mutations. In this project, we implement six machine learning methods to detect AML from a patient's genetic profile and hematological data. The six models are support vector machines, classification trees, boosted trees, random forests, Naive Bayes, and logistic regression. Each model was tuned to maximize the area under the receiver operating characteristic (ROC) curve, and accuracy scores were also recorded. Random forests and logistic regression achieved the highest accuracy (just over 70%), while support vector machines and Naive Bayes achieved the lowest (approximately 60%).
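
The two metrics named in the abstract measure different things, and a small sketch may help show why both were worth recording. The code below is not taken from the project notebooks: the function names `roc_auc` and `accuracy`, the 0.5 decision threshold, and the labels and scores are all made up for illustration. It computes the area under the ROC curve using the standard pairwise-ranking identity (the probability that a positive case is scored above a negative one) and compares it with accuracy at a fixed threshold.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking identity.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case; ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly when score >= threshold means positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(int(p == y) for p, y in zip(preds, labels))
    return correct / len(labels)


# Hypothetical labels (1 = AML, 0 = not AML) and model scores, illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_score = [0.45, 0.40, 0.35, 0.30, 0.20, 0.10, 0.05, 0.02]

auc = roc_auc(y_true, y_score)
acc = accuracy(y_true, y_score)
print(f"AUC = {auc:.3f}, accuracy at 0.5 = {acc:.3f}")
```

In this toy data every positive case is ranked above every negative case, so the AUC is 1.0, yet accuracy at the 0.5 threshold is only 0.625 because no score reaches the threshold. AUC depends only on how cases are ranked, while accuracy also depends on where the decision threshold sits.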

See Report.pdf for more information.