Explore the Iris dataset

Explore Data

This project explores the well-known Iris dataset, which records sepal and petal measurements for three species of iris flowers. The dataset is often used in machine learning and statistics for classification and clustering tasks.

Dataset Description

The Iris dataset contains the following columns:

  • Id: Identifier for each record
  • SepalLengthCm: Sepal length in centimeters
  • SepalWidthCm: Sepal width in centimeters
  • PetalLengthCm: Petal length in centimeters
  • PetalWidthCm: Petal width in centimeters
  • Species: Species of iris flower (setosa, versicolor, virginica)
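As a minimal sketch, the columns above can be checked right after loading the data with pandas. The helper name `load_iris` and the in-memory sample are assumptions made for illustration; the repository's actual CSV file name is not stated here, so pass your own path to the helper instead.

```python
import io

import pandas as pd

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = [
    "Id",
    "SepalLengthCm",
    "SepalWidthCm",
    "PetalLengthCm",
    "PetalWidthCm",
    "Species",
]


def load_iris(source):
    """Read a CSV (path or file-like object) and verify the expected columns.

    Hypothetical helper, not part of the project.
    """
    df = pd.read_csv(source)
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return df


# Tiny in-memory stand-in for the real file (three illustrative rows only).
sample_csv = io.StringIO(
    "Id,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species\n"
    "1,5.1,3.5,1.4,0.2,setosa\n"
    "2,7.0,3.2,4.7,1.4,versicolor\n"
    "3,6.3,3.3,6.0,2.5,virginica\n"
)

iris = load_iris(sample_csv)
print(iris.dtypes)  # measurement columns should be numeric, Species text
```

In the notebook, `load_iris` would be called with the path to the dataset file rather than the sample text.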

Questions Explored

  1. What are the different species of iris flowers in the dataset?
  2. What is the average sepal length for each species of iris flower?
  3. What is the maximum petal width among all the iris flowers in the dataset?
  4. How many iris flowers have a sepal length greater than 6.0 cm?
  5. What is the distribution of sepal widths among the iris flowers?
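The questions above map fairly directly onto pandas operations. The sketch below is one possible approach, not necessarily the one used in the notebook; it runs on a six-row illustrative sample built inline (an assumption, so the numbers are not results for the full dataset), and the variable names are made up for this example.

```python
import pandas as pd

# Small illustrative sample with the same columns as the dataset.
iris = pd.DataFrame(
    {
        "Id": [1, 2, 3, 4, 5, 6],
        "SepalLengthCm": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
        "SepalWidthCm": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
        "PetalLengthCm": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1],
        "PetalWidthCm": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
        "Species": [
            "setosa", "setosa",
            "versicolor", "versicolor",
            "virginica", "virginica",
        ],
    }
)

# 1. Distinct species in the data.
species = iris["Species"].unique()

# 2. Average sepal length per species.
mean_sepal_length = iris.groupby("Species")["SepalLengthCm"].mean()

# 3. Maximum petal width over all flowers.
max_petal_width = iris["PetalWidthCm"].max()

# 4. Number of flowers with sepal length greater than 6.0 cm.
n_long_sepals = int((iris["SepalLengthCm"] > 6.0).sum())

# 5. Distribution of sepal widths: summary statistics and per-value counts.
sepal_width_summary = iris["SepalWidthCm"].describe()
sepal_width_counts = iris["SepalWidthCm"].value_counts().sort_index()

print(species)
print(mean_sepal_length)
print(max_petal_width, n_long_sepals)
print(sepal_width_summary)
```

For the last question, a histogram (for example `iris["SepalWidthCm"].hist()`, which requires matplotlib) is a common visual alternative to the summary table.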

Technologies Used

  • Python
  • pandas

How to set up

  1. Clone the project.
 git clone https://github.com/CyrilBaah/Explore-Data.git
 cd Explore-Data
  2. Create and activate a virtualenv.
 virtualenv env
 source env/bin/activate
  3. Install the packages.
 pip install jupyter pandas
  4. Start Jupyter.
 jupyter notebook