Develop an application that predicts the risk of a heart attack in a patient using various machine learning algorithms. The project involves comparing different models to determine which provides the best accuracy and then employing AutoML library EvalML for further prediction enhancements.
Initial exploration and understanding of the dataset, including data types, distributions, and relationships.
Enhancing the dataset by creating new features or modifying existing ones to improve model performance.
Normalizing the dataset to ensure that features contribute equally to the model.
Training multiple machine learning models to predict the risk of a heart attack.
Evaluating the models to see which performs best on the given dataset. #Dataset Features:
Age: Age of the patient. Sex: Sex of the patient. exang: Exercise-induced angina (1 = yes; 0 = no). ca: Number of major vessels (0-3). cp: Chest pain type: 0: Typical angina 1: Atypical angina 2: Non-anginal pain 3: Asymptomatic trtbps: Resting blood pressure (in mm Hg). chol: Cholesterol in mg/dl fetched via BMI sensor. fbs: Fasting blood sugar > 120 mg/dl (1 = true; 0 = false). rest_ecg: Resting electrocardiographic results: 0: Normal 1: ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV) 2: Probable or definite left ventricular hypertrophy by Estes' criteria thalach: Maximum heart rate achieved. target: 0 = less chance of heart attack, 1 = more chance of heart attack.
Initial model achieved an accuracy of ~80%.
A function was implemented to select the optimal value of k for maximum accuracy.
Achieved an accuracy of 80%.
Implemented as a boosting technique using multiple models for better accuracy. Initial performance was 50% accuracy, leading to hyperparameter tuning using Grid Search CV.
Applied to the top 3 performing algorithms for hyperparameter tuning, resulting in an accuracy of 81%.
Logistic Regression emerged as the best model without hyperparameter tuning, achieving an accuracy of 85.7%.
The final model was serialized using Python's pickle library for future deployment.
EvalML, an open-source AutoML library, was employed to automate and evaluate different machine learning pipelines for optimal performance on the dataset. Outcome: Successfully developed a predictive model with 85.7% accuracy in identifying individuals at risk of a heart attack.