Explore Data

This project explores the Iris dataset, a classic collection of sepal and petal measurements for three species of iris flowers. The dataset is widely used in machine learning and statistics as an example for classification and clustering tasks.

Dataset Description

The Iris dataset contains the following columns:

Id: Identifier for each record
SepalLengthCm: Sepal length in centimeters
SepalWidthCm: Sepal width in centimeters
PetalLengthCm: Petal length in centimeters
PetalWidthCm: Petal width in centimeters
Species: Species of iris flower (setosa, versicolor, virginica)
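
As a minimal sketch, the columns listed above could be loaded and checked with pandas as follows. The inline sample rows, the `load_iris` helper, and the choice of `Id` as the index are assumptions made for illustration and are not taken from the project. In the notebook, the path to the project's CSV file would be passed in place of the in-memory sample.

```python
import io

import pandas as pd

# Hypothetical three-row sample that follows the column layout described above.
# The real file in the repository may use a different name or species labels.
SAMPLE_CSV = """Id,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species
1,5.1,3.5,1.4,0.2,setosa
2,7.0,3.2,4.7,1.4,versicolor
3,6.3,3.3,6.0,2.5,virginica
"""

# Columns expected after "Id" has been moved into the index.
EXPECTED_COLUMNS = [
    "SepalLengthCm",
    "SepalWidthCm",
    "PetalLengthCm",
    "PetalWidthCm",
    "Species",
]


def load_iris(source):
    """Read the CSV from a path or file-like object and check its columns."""
    df = pd.read_csv(source, index_col="Id")
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return df


# An in-memory buffer stands in for the CSV file here.
iris = load_iris(io.StringIO(SAMPLE_CSV))
print(iris.dtypes)
```

Using `Id` as the index keeps the identifier out of numeric summaries such as `describe()`, which is one reasonable choice among several.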

Questions Explored

What are the different species of iris flowers in the dataset?
What is the average sepal length for each species of iris flower?
What is the maximum petal width among all the iris flowers in the dataset?
How many iris flowers have a sepal length greater than 6.0 cm?
What is the distribution of sepal widths among the iris flowers?
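
Each of these questions maps onto a short pandas expression. The sketch below uses a hypothetical six-row sample with the same column names, so the values it produces are not results from the real dataset, and the variable names are illustrative rather than those used in the project's notebook.

```python
import pandas as pd

# Hypothetical six-row sample; the real dataset has many more rows.
iris = pd.DataFrame(
    {
        "SepalLengthCm": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
        "SepalWidthCm": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
        "PetalLengthCm": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1],
        "PetalWidthCm": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
        "Species": [
            "setosa", "setosa",
            "versicolor", "versicolor",
            "virginica", "virginica",
        ],
    }
)

# 1. Distinct species present in the data.
species = iris["Species"].unique()

# 2. Average sepal length for each species.
mean_sepal_length = iris.groupby("Species")["SepalLengthCm"].mean()

# 3. Maximum petal width across all flowers.
max_petal_width = iris["PetalWidthCm"].max()

# 4. Number of flowers with a sepal length greater than 6.0 cm.
long_sepal_count = int((iris["SepalLengthCm"] > 6.0).sum())

# 5. Distribution of sepal widths: summary statistics and per-value counts.
sepal_width_summary = iris["SepalWidthCm"].describe()
sepal_width_counts = iris["SepalWidthCm"].value_counts().sort_index()

print(species)
print(mean_sepal_length)
print(max_petal_width, long_sepal_count)
print(sepal_width_summary)
```

For the last question, `describe()` gives a numeric summary; a histogram would show the same distribution visually but would need a plotting library that is not listed among the technologies used.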

Technologies Used

Python
pandas

How to Set Up

Clone the project.

 git clone https://github.com/CyrilBaah/Explore-Data.git

 cd Explore-Data

Create and activate a virtual environment

 virtualenv env
 source env/bin/activate

Install the packages

 pip install jupyter pandas

Start Jupyter Notebook

 jupyter notebook