This is a machine learning model that tries to determine whether a person has diabetes.
The dataset is in comma-separated values (.csv) format and is included with the code.
The following packages were used:
- scikit-learn: To preprocess the data, instantiate the models, split the data, cross-validate and score the models.
- pandas: To import the dataset, convert it into a DataFrame and view the data.
- seaborn & matplotlib: To visualize the data and to create heat maps to perform feature selection.
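As a rough illustration of the heat-map step, the sketch below computes a correlation matrix with pandas and draws it with seaborn. The column names (glucose, bmi, age, outcome) and the randomly generated values are hypothetical stand-ins, not the project's dataset; in practice the DataFrame would come from reading the included .csv file with pandas.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical stand-in data (assumption): replace with pd.read_csv(...) on the real file.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "glucose": rng.normal(120, 30, n),
    "bmi": rng.normal(32, 7, n),
    "age": rng.integers(21, 80, n),
})
# Synthetic target that depends on glucose plus noise, so one feature stands out.
df["outcome"] = (df["glucose"] + rng.normal(0, 20, n) > 130).astype(int)

# Pairwise Pearson correlations between all columns.
corr = df.corr()

# Draw the heat map and save it to disk.
ax = sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1)
ax.set_title("Feature correlation")
plt.tight_layout()
plt.savefig("correlation_heatmap.png")

# Rank features by the absolute value of their correlation with the target.
ranking = corr["outcome"].drop("outcome").abs().sort_values(ascending=False)
print(ranking)
```

Features that correlate weakly with the target, or strongly with each other, are candidates for removal. Correlation only captures linear relationships, so this is a first-pass filter rather than a complete feature selection method.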
Two models were used in this project:
- The K-Nearest Neighbors classifier.
- The Multi-Layer Perceptron classifier.
Results on the test data:
- The K-Nearest Neighbors classifier recorded an accuracy of 72%.
- The Multi-Layer Perceptron classifier recorded an accuracy of 74%.
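The workflow of splitting, cross-validating and scoring the two models might look roughly like the sketch below. It is not the project's actual code: it uses a synthetic dataset from scikit-learn's make_classification in place of the .csv file, and the hyperparameters (n_neighbors=5, a single hidden layer of 16 units, a 80/20 split) are assumptions chosen for illustration, so the printed scores will not match the figures reported above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the diabetes data (assumption): 8 numeric features, binary target.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)

# Hold out 20% of the rows as test data, keeping the class balance in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Both models are sensitive to feature scale, so each is wrapped in a scaling pipeline.
models = {
    "KNeighborsClassifier": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
    "MLPClassifier": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0),
    ),
}

results = {}
for name, model in models.items():
    # 5-fold cross-validation on the training split only.
    cv_scores = cross_val_score(model, X_train, y_train, cv=5)
    # Refit on the full training split, then score once on the held-out test data.
    model.fit(X_train, y_train)
    results[name] = model.score(X_test, y_test)
    print(f"{name}: CV accuracy {cv_scores.mean():.2f}, test accuracy {results[name]:.2f}")
```

Putting the scaler inside the pipeline means it is fitted only on the training folds during cross-validation, which avoids leaking test-set statistics into the model.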