/blood_source_classifier

Training and validating a logistic regression classification model to predict acute myeloid leukemia blood source (bone marrow vs. peripheral blood) from gene expression data.

Primary Language: Jupyter Notebook

Description:

Training and validating a logistic regression model in scikit-learn to classify the blood source (bone marrow vs. peripheral blood) of samples from 2,024 AML patients using gene expression data (2,761 expression arrays). The model has 96.94% (+/- 0.20) training/validation accuracy. The data, code, and supplementary materials can all be found here.
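As a rough illustration of the workflow described above, the following is a minimal sketch of cross-validated logistic regression in scikit-learn. It is not the notebook's actual code: the synthetic data from `make_classification`, the feature count, the `StandardScaler` step, the 5-fold split, and the use of one standard deviation for the "+/-" spread are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the expression matrix (rows = arrays, columns = probe sets).
# The real data set is far larger; these sizes are chosen only to keep the sketch fast.
X, y = make_classification(
    n_samples=300, n_features=50, n_informative=10, random_state=0
)

# Assumed preprocessing: standardize each feature, then fit logistic regression.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Assumed evaluation scheme: 5-fold cross-validated accuracy.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")

# Report mean accuracy and spread in percent (spread here is one standard deviation).
print(f"Accuracy: {scores.mean() * 100:.2f} (+/- {scores.std() * 100:.2f})")
```

With the real data, `X` would hold the expression values and `y` the bone marrow vs. peripheral blood labels; the reported figure depends on how the notebook defines the split and the spread.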

Confusion Matrix: AML_BloodSource_LR-Classification_Traning_on_44754_ProbSets_from_2024_Subjects-Confusion_Matrix

ROC Curve: AML_BloodSource_LR-Classification_Traning_on_44754_ProbSets_from_2024_Subjects-ROC_Curve
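The two figures above can be reproduced in outline with scikit-learn's metrics module. This is a hedged sketch on synthetic data, not the repository's plotting code: the 75/25 stratified hold-out split and the data set itself are assumptions, and only the numbers behind the plots are computed here.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the expression data (assumption, not the real arrays).
X, y = make_classification(
    n_samples=300, n_features=50, n_informative=10, random_state=0
)

# Assumed hold-out scheme: stratified 75/25 split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Confusion matrix: rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, clf.predict(X_test))

# ROC curve and AUC use the predicted probability of the positive class.
probs = clf.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, probs)
auc = roc_auc_score(y_test, probs)

print(cm)
print(f"AUC: {auc:.3f}")
```

The `fpr` and `tpr` arrays can be passed to any plotting library to draw the curve.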