
Build and test multiple scikit-learn machine learning models to predict species from flower measurements.


Basic Predictive Modeling on the Iris Data Set




The data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other 2; the latter are NOT linearly separable from each other.
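The separability claim for the first class can be checked with a short sketch. It assumes scikit-learn's bundled copy of the data (where setosa is class 0) and uses a 2.5 cm petal-length cut-off chosen purely for illustration; neither detail comes from this project's notebook.

```python
from sklearn.datasets import load_iris

iris = load_iris()
# Columns: sepal length, sepal width, petal length, petal width (all in cm)
X, y = iris.data, iris.target

# Setosa is class 0 in scikit-learn's copy of the data.
is_setosa = y == 0

# A single threshold on petal length (column 2) is a linear decision rule.
# The 2.5 cm cut-off is an illustrative assumption, not a value from the project.
predicted_setosa = X[:, 2] < 2.5

setosa_separable = bool((predicted_setosa == is_setosa).all())
print("Setosa separated by petal length < 2.5 cm:", setosa_separable)
```

No comparable single cut-off splits versicolor from virginica cleanly, which is why the nonlinear models are worth trying on those two classes.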

Predicted attribute: class of iris plant. This is an exceedingly simple domain.

Attribute Information:

  1. sepal length in cm
  2. sepal width in cm
  3. petal length in cm
  4. petal width in cm
  5. Species class:
  • Iris Setosa
  • Iris Versicolor
  • Iris Virginica
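The attributes listed above can be inspected with a minimal sketch like the one below. It assumes the copy of the data set bundled with scikit-learn (`load_iris`) rather than whatever file the notebook actually reads, and the `species` column name is a label chosen here for illustration.

```python
import pandas as pd
from sklearn.datasets import load_iris

# as_frame=True returns the data as a pandas DataFrame (scikit-learn >= 0.23).
iris = load_iris(as_frame=True)

# Four measurement columns plus an integer "target" column.
df = iris.frame.copy()

# Map the integer codes 0, 1, 2 to readable species names.
# "species" is a hypothetical column name used only in this sketch.
df["species"] = df["target"].map(dict(enumerate(iris.target_names)))

print(df.shape)
print(df["species"].value_counts())
print(df.describe())
```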

Technologies and Libraries

The project uses the following Python libraries:

- sklearn
- scipy
- numpy
- matplotlib
- pandas

Machine Learning Algorithms

I explored the following seven algorithms from scikit-learn:

  • Logistic Regression (LR)
  • Linear Discriminant Analysis (LDA)
  • K-Nearest Neighbors (KNN)
  • Classification and Regression Trees (CART)
  • Gaussian Naive Bayes (NB)
  • Support Vector Machines (SVM)
  • Gradient Boosting Classifier (GB)

This is a good mixture of simple linear (LR and LDA), nonlinear (KNN, CART, NB and SVM), and ensemble (GB) algorithms.
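One way to compare the seven algorithms is a cross-validated accuracy estimate. The sketch below is an assumption about how such a comparison might look, not the notebook's actual code: the 5-fold stratified split, the default hyperparameters, `max_iter=1000`, and the random seeds are all illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Default settings throughout; max_iter is raised only so LR converges cleanly.
models = {
    "LR": LogisticRegression(max_iter=1000),
    "LDA": LinearDiscriminantAnalysis(),
    "KNN": KNeighborsClassifier(),
    "CART": DecisionTreeClassifier(random_state=0),
    "NB": GaussianNB(),
    "SVM": SVC(),
    "GB": GradientBoostingClassifier(random_state=0),
}

# Stratified folds keep the 50/50/50 class balance in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

cv_scores = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    cv_scores[name] = scores.mean()
    print(f"{name}: {scores.mean():.3f} (+/- {scores.std():.3f})")
```

Using the same `cv` object for every model means each algorithm is scored on identical folds, so differences in the means are not an artifact of different splits.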


  • SVM - 97%
  • KNN - 100%
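
The two figures above read like accuracy scores on a held-out validation set, although the source does not say how they were obtained. A minimal sketch of such an evaluation follows, assuming an 80/20 stratified split and default model settings; the exact numbers depend on the split and need not match the ones reported here.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Hold out 20% of the samples; the split ratio and seed are assumptions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

holdout_accuracy = {}
for name, model in {"SVM": SVC(), "KNN": KNeighborsClassifier()}.items():
    model.fit(X_train, y_train)
    # Fraction of held-out flowers whose species is predicted correctly.
    holdout_accuracy[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {holdout_accuracy[name]:.0%}")
```

With only 30 held-out flowers, a single misclassification moves the score by more than three percentage points, so small gaps between models should be read with caution.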