The project is based on the well-known Iris dataset. The goal is to efficiently classify Iris flowers into the 3 Iris species in the dataset:
- Setosa
- Versicolor
- Virginica
The classification is based on the following attributes of the flowers:
- Petal length
- Petal width
- Sepal length
- Sepal width
The project consists of the following steps:
- Dataset analysis:
  - Distribution of the different attributes
  - Verification that no values are missing
  - Visualization of the data
- Classification of the data:
  - Using the Random Forest algorithm
  - Using a decision tree
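
The dataset analysis steps can be sketched as follows. This is a minimal sketch, not the project's actual code: it assumes the data is loaded from scikit-learn's bundled copy of the Iris dataset (the project may read it from a file instead), and the names `df`, `summary`, `missing_per_column`, `class_counts`, and `plot_pairs` are hypothetical.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Assumption: use scikit-learn's bundled copy of the Iris dataset.
iris = load_iris(as_frame=True)
df = iris.frame.copy()

# Replace the numeric target with readable species names.
df["species"] = df["target"].map(dict(enumerate(iris.target_names)))
df = df.drop(columns="target")

# Distribution of the attributes: count, mean, std, min, quartiles, max.
summary = df.describe()

# Number of missing values in each column.
missing_per_column = df.isna().sum()

# Number of samples per species.
class_counts = df["species"].value_counts()


def plot_pairs(frame: pd.DataFrame) -> None:
    """Draw pairwise scatter plots of the attributes, colored by species."""
    # Imported here so the analysis above runs without the plotting packages.
    import matplotlib.pyplot as plt
    import seaborn as sns

    sns.pairplot(frame, hue="species")
    plt.show()
```

Calling `plot_pairs(df)` opens the visualization. Printing `summary`, `missing_per_column`, and `class_counts` covers the distribution and missing-value checks.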
The project requires Python 3.8 or higher, as well as the following Python packages and modules:
- scikit-learn
- NumPy
- pandas
- Matplotlib (Pyplot module)
- Seaborn
Algorithm | Accuracy (%) |
---|---|
Random Forest (3 trees) | 100 |
Decision Tree | 97.78 |
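
Accuracies like those in the table could be measured along the following lines. This is a sketch under stated assumptions rather than the project's actual procedure. The 30% hold-out, the stratified split, and `random_state=0` are assumptions (97.78% is what 44 correct predictions out of 45 would give, which would be consistent with a 45-sample test set, but the split is not stated here). The figures obtained depend on the split and the seed and may differ from the table.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Assumption: hold out 30% of the 150 samples (45 flowers) for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

models = {
    "Random Forest (3 trees)": RandomForestClassifier(n_estimators=3, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

# Fit each model and record its test accuracy as a percentage.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = 100 * accuracy_score(y_test, model.predict(X_test))

for name, accuracy in accuracies.items():
    print(f"{name}: {accuracy:.2f}%")
```

With only 45 test samples, one misclassified flower changes the accuracy by about 2.2 percentage points, so small differences between the two models should be read with caution.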