Build and test multiple scikit-learn machine learning models to predict iris species from flower measurements.
The data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other 2; the latter are NOT linearly separable from each other.
Predicted attribute: class of iris plant. This is an exceedingly simple domain.
Attribute Information:
- sepal length in cm
- sepal width in cm
- petal length in cm
- petal width in cm
- Species class:
  - Iris Setosa
  - Iris Versicolour
  - Iris Virginica
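
The class balance and the separability claim above can be checked directly. The following is a minimal sketch that assumes the copy of the iris data bundled with scikit-learn (`load_iris`) rather than whatever file this project actually reads; the column names are the ones scikit-learn assigns, and the `species` column is added here only for illustration.

```python
from sklearn.datasets import load_iris

# Assumption: use scikit-learn's bundled copy of the iris data set.
iris = load_iris(as_frame=True)
df = iris.frame  # four measurement columns plus an integer "target" column

# Map the integer targets to readable species names.
df["species"] = df["target"].map(dict(enumerate(iris.target_names)))

# Each of the 3 classes should have 50 instances.
class_counts = df["species"].value_counts().to_dict()
print(class_counts)

# One class (setosa) is linearly separable from the other two:
# a single threshold on petal length is enough to split it off.
setosa_max = df.loc[df["species"] == "setosa", "petal length (cm)"].max()
others_min = df.loc[df["species"] != "setosa", "petal length (cm)"].min()
setosa_is_separable = setosa_max < others_min
print(setosa_max, others_min, setosa_is_separable)
```

Versicolor and virginica overlap on every single measurement, which is why the nonlinear models are worth comparing against the linear ones.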
The project uses the following Python libraries:
- sklearn
- scipy
- numpy
- matplotlib
- pandas
I explored the following 7 algorithms from scikit-learn:
- Logistic Regression (LR)
- Linear Discriminant Analysis (LDA)
- K-Nearest Neighbors (KNN)
- Classification and Regression Trees (CART)
- Gaussian Naive Bayes (NB)
- Support Vector Machines (SVM)
- Gradient Boosting Classifier (GB)
This is a good mixture of simple linear (LR and LDA) and nonlinear (KNN, CART, NB, SVM and GB) algorithms.
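
One way to compare the seven algorithms is to score each with cross-validation on a training split. The sketch below is illustrative, not the project's actual code: the 80/20 split, the 10 stratified folds, `random_state=1`, `max_iter=1000`, and otherwise default hyperparameters are all assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_iris(return_X_y=True)

# Assumption: hold out 20% of the data and cross-validate on the rest.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1, stratify=y
)

# The seven algorithms listed above, with mostly default settings (assumed).
models = {
    "LR": LogisticRegression(max_iter=1000),
    "LDA": LinearDiscriminantAnalysis(),
    "KNN": KNeighborsClassifier(),
    "CART": DecisionTreeClassifier(random_state=1),
    "NB": GaussianNB(),
    "SVM": SVC(),
    "GB": GradientBoostingClassifier(random_state=1),
}

# 10-fold stratified cross-validation keeps the class balance in every fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)

cv_scores = {}
for name, model in models.items():
    scores = cross_val_score(model, X_train, y_train, cv=cv, scoring="accuracy")
    cv_scores[name] = scores.mean()
    print(f"{name}: mean accuracy {scores.mean():.3f} (std {scores.std():.3f})")
```

The exact scores depend on the split and the random seed, so small differences between models on a data set this size should not be over-interpreted.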
- SVM - 97%
- KNN - 100%
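
The text does not say how the percentages above were measured. Assuming they are accuracies on a held-out validation split, a sketch of that evaluation for SVM and KNN could look like the following. The split size, seed, and default hyperparameters are assumptions, so the numbers it prints will not necessarily match the ones reported here.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Assumption: 80/20 stratified split with a fixed seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1, stratify=y
)

holdout_accuracy = {}
for name, model in {"SVM": SVC(), "KNN": KNeighborsClassifier()}.items():
    model.fit(X_train, y_train)          # train on the 80% split
    predictions = model.predict(X_test)  # predict the unseen 20%
    holdout_accuracy[name] = accuracy_score(y_test, predictions)
    print(f"{name}: {holdout_accuracy[name]:.0%}")
```

With only 30 held-out flowers, a single misclassification moves the accuracy by about 3 percentage points.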