This project consists of modeling Obstructive Sleep Apnea (OSA) using Machine Learning techniques. It was developed in Python, mainly using Pandas, NumPy, Scikit-learn, Seaborn and Plotly.
My methodology included six steps:
- Domain Knowledge Study -> OSA symptoms, diagnosis and treatment.
- Data Wrangling -> Feature description and cleansing. The raw data was provided by the professors beforehand.
- Exploratory Data Analysis (EDA) -> Five types of plots (bar plot, violin plot, box plot, pair plot and heatmap).
- Data Preparation -> Feature scaling (standardization and normalization), data transformation (log(x+1) and polynomial features), feature selection (filter, wrapper and embedded methods) and dimensionality reduction (PCA and t-SNE).
- Machine Learning Modeling -> Both classification and regression. More than 20 models were tested: linear regression, logistic regression, SGD, regularized models, Naive Bayes, kNN, decision trees, ensembles, SVM, perceptron and MLP.
- Results and Model Comparison -> Models were compared in terms of time, size and accuracy.
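As a rough illustration of how the data preparation, modeling and comparison steps fit together, here is a minimal Scikit-learn sketch. It is not the project's actual code: the data is synthetic (generated with `make_classification`), the three candidate models and their hyperparameters are arbitrary picks, and "time" and "size" are assumed to mean fit time and pickled size, which may differ from the measures used in the reports.

```python
import pickle
import time

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): the real OSA dataset is not used here.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=5, random_state=0
)
X = np.abs(X)  # keep features non-negative so log(x + 1) is well defined

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Hypothetical shortlist; the project tested more than 20 models.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "knn": KNeighborsClassifier(n_neighbors=5),
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=0),
}


def evaluate(name, estimator):
    """Fit one pipeline and report fit time, pickled size and test accuracy."""
    # Data preparation: log(x + 1) transform, then standardization.
    model = make_pipeline(
        FunctionTransformer(np.log1p), StandardScaler(), estimator
    )
    start = time.perf_counter()
    model.fit(X_train, y_train)
    fit_seconds = time.perf_counter() - start
    return {
        "model": name,
        "fit_seconds": fit_seconds,
        "size_bytes": len(pickle.dumps(model)),
        "accuracy": model.score(X_test, y_test),
    }


results = [evaluate(name, est) for name, est in candidates.items()]

# Print the candidates from most to least accurate on the held-out split.
for row in sorted(results, key=lambda r: r["accuracy"], reverse=True):
    print(
        f"{row['model']:<20} acc={row['accuracy']:.3f} "
        f"fit={row['fit_seconds'] * 1000:.1f} ms size={row['size_bytes']} B"
    )
```

Wrapping the preprocessing and the estimator in one pipeline means the scaler is fitted on the training split only, so the test accuracy is not inflated by information leaking from the held-out data.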
You can find the full documentation in docs. There are two reports: one for the theoretical analysis and another for the implementation.
Project proposed for the subject Predictive and Descriptive Learning and Machine Learning Laboratory, ETSIT-UPM.