This dataset is used to predict whether a patient will develop diabetes within the next 5 years, based on several variables. The variables are as follows:

Variable | Description |
---|---|
Outcome | Diabetes = yes (1) or no (0) |
Pregnancies | Number of times pregnant |
Glucose | Plasma glucose concentration 2 hours into an oral glucose tolerance test |
BloodPressure | Diastolic blood pressure (mmHg) |
SkinThickness | Triceps skin fold thickness (mm) |
BMI | Body mass index (weight in kg / (height in m)^2) |
DiabetesPedigreeFunction | A function that scores the likelihood of diabetes based on family history |
Age | Age in years |

The dataset contains 768 samples in total, all taken from female Pima Indians.
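
To make the variable table concrete, here is a minimal sketch of how one row could be parsed and checked against those columns. The column order, the `PatientRecord` class, the `parse_records` helper, and the sample rows are all assumptions made for illustration. The sample values are made up and do not come from the dataset.

```python
import csv
import io
from dataclasses import dataclass

# Column names come from the variable table; their order is an assumption.
COLUMNS = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "BMI", "DiabetesPedigreeFunction", "Age", "Outcome",
]


@dataclass
class PatientRecord:
    """One sample; a hypothetical container, not part of the dataset."""
    pregnancies: int
    glucose: float
    blood_pressure: float   # diastolic, mmHg
    skin_thickness: float   # triceps skin fold, mm
    bmi: float
    diabetes_pedigree_function: float
    age: int                # years
    outcome: int            # 1 = diabetes, 0 = no diabetes


def parse_records(csv_text):
    """Parse CSV text with a header row into a list of PatientRecord."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = set(COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    records = []
    for row in reader:
        outcome = int(row["Outcome"])
        if outcome not in (0, 1):
            raise ValueError(f"Outcome must be 0 or 1, got {outcome}")
        records.append(PatientRecord(
            pregnancies=int(row["Pregnancies"]),
            glucose=float(row["Glucose"]),
            blood_pressure=float(row["BloodPressure"]),
            skin_thickness=float(row["SkinThickness"]),
            bmi=float(row["BMI"]),
            diabetes_pedigree_function=float(row["DiabetesPedigreeFunction"]),
            age=int(row["Age"]),
            outcome=outcome,
        ))
    return records


# Made-up rows for illustration only; NOT values from the dataset.
SAMPLE_CSV = """\
Pregnancies,Glucose,BloodPressure,SkinThickness,BMI,DiabetesPedigreeFunction,Age,Outcome
2,120,70,25,28.5,0.45,33,0
5,160,80,30,35.1,0.60,47,1
"""

records = parse_records(SAMPLE_CSV)
print(len(records), records[1].outcome)  # prints "2 1"
```

With the real file, the same function could be applied to the file's text, provided its header uses the column names from the table.
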
After trial and error, the final model reached an accuracy of 91%.
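
The text does not say how the 91% figure was measured. As a minimal sketch, assuming it is plain classification accuracy (the fraction of samples whose predicted `Outcome` matches the true one), it could be computed as below. The `accuracy` and `majority_baseline_accuracy` helpers and the label lists are hypothetical, and the labels are made up rather than taken from the model or the dataset.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline_accuracy(y_true):
    """Accuracy of always predicting the most common label."""
    if not y_true:
        raise ValueError("cannot compute a baseline on empty input")
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)


# Made-up labels for illustration only; NOT results from the actual model.
y_true = [0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
y_pred = [0, 0, 0, 1, 0, 0, 1, 0, 0, 1]

print(accuracy(y_true, y_pred))            # prints 0.9
print(majority_baseline_accuracy(y_true))  # prints 0.6
```

Comparing against the majority-class baseline helps put an accuracy figure in context, since a model that always predicts the more common outcome can already score well when the classes are imbalanced.
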