Classification-of-the-102-Category-Flowers-Dataset

This was a group project that we completed as part of a course.

Dataset

The dataset contains 102 flower classes, each with between 40 and 258 images. Images within a class vary in appearance. The dataset can be downloaded here.

Result

Code for this section can be found in Final_draft_Cnn_flowers.ipynb.
We tried several approaches on this dataset. A comparison of all the approaches is shown below:
Comparison table
Based on the comparison in Table 1, we chose the small model as our final model and trained it for longer. We set a maximum of 50 epochs with an early stopping condition: training stops if the validation accuracy does not improve by at least 0.0001 within 5 epochs. Training stopped after 23 epochs with a training accuracy of 100% and a training loss of 9.8343e-04. The validation accuracy and loss were 53% and 3.54 respectively.
Loss
Accuracy
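The early stopping condition described above can be sketched in plain Python. This is a minimal illustration rather than the notebook's code: the function name `early_stopping_epoch` and the accuracy history are hypothetical, and only the thresholds (an improvement of 0.0001, a patience of 5 epochs) come from the text.

```python
def early_stopping_epoch(val_accuracies, min_delta=1e-4, patience=5):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the validation accuracy has failed to beat the
    best value seen so far by more than `min_delta` for `patience`
    consecutive epochs.
    """
    best = float("-inf")
    epochs_without_improvement = 0
    for epoch, accuracy in enumerate(val_accuracies, start=1):
        if accuracy - best > min_delta:
            # Real improvement: remember it and reset the counter.
            best = accuracy
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Hypothetical validation accuracies, for illustration only.
history = [0.10, 0.20, 0.30, 0.30, 0.30, 0.30, 0.30, 0.30]
stop_epoch = early_stopping_epoch(history)
```

Deep learning frameworks usually provide this rule directly. Keras, for example, has an `EarlyStopping` callback with `min_delta` and `patience` parameters, although this README does not state which framework the notebook uses.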
The average test accuracy was 47%, and the average over all class-specific accuracies was 44%. The average precision was 0.49.
The confusion matrix and the number of classes per precision value are shown below:
Confusion Matrix
Number of classes per precision value
Class-specific accuracy and precision are shown below in their respective graphs.
Accuracy of each class
Precision of each class
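The metrics reported above can all be derived from a confusion matrix. The sketch below assumes that "class-specific accuracy" means the fraction of a class's images that were classified correctly (per-class recall); the notebook may define it differently. The function name `per_class_metrics` and the toy matrix are hypothetical.

```python
def per_class_metrics(confusion):
    """Compute overall accuracy plus per-class accuracy and precision.

    confusion[i][j] is the number of samples whose true class is i and
    whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    overall_accuracy = correct / total

    class_accuracy = []
    class_precision = []
    for k in range(n):
        true_k = sum(confusion[k])                        # samples that are class k
        pred_k = sum(confusion[i][k] for i in range(n))   # samples predicted as class k
        class_accuracy.append(confusion[k][k] / true_k if true_k else 0.0)
        class_precision.append(confusion[k][k] / pred_k if pred_k else 0.0)
    return overall_accuracy, class_accuracy, class_precision


# Toy two-class matrix with unequal class sizes, for illustration only.
toy = [[8, 2],
       [1, 1]]
overall, accuracy_per_class, precision_per_class = per_class_metrics(toy)
mean_class_accuracy = sum(accuracy_per_class) / len(accuracy_per_class)
```

Under this reading, the overall accuracy and the mean of the class-specific accuracies differ whenever classes have different numbers of test images, because the overall figure weights each class by its size. In the toy example the overall accuracy is 0.75 while the mean class accuracy is 0.65.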

Cross Validation Results

Code for this section can be found in Accuracy_per_class_cross_val.ipynb.
We ran 10-fold cross-validation. We used Scikit-learn's StratifiedKFold() function so that every class is represented in the same proportion in both the training and validation datasets. The maximum and minimum test accuracies were 50% and 31% respectively. Across the 10 folds, the standard deviation of the test accuracy was 0.0548 and the average was 47%.
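As a rough sketch of how stratified splitting behaves, the snippet below runs `StratifiedKFold` on synthetic labels and counts how many samples of each class land in each validation fold. The labels (5 classes with 40 images each), the placeholder features, and the `random_state` value are assumptions made for illustration. The notebook works with the real 102-class data and may use different settings.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical labels: 5 classes with 40 images each.
n_classes, images_per_class = 5, 40
labels = np.repeat(np.arange(n_classes), images_per_class)

# Placeholder features; split() only uses the number of samples and the labels.
features = np.zeros((labels.size, 1))

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

fold_class_counts = []
for train_idx, val_idx in skf.split(features, labels):
    # Count validation samples per class in this fold.
    counts = np.bincount(labels[val_idx], minlength=n_classes)
    fold_class_counts.append(counts)
    # A model would be trained on train_idx and evaluated on val_idx here.
```

With 40 images per class and 10 folds, every validation fold receives 4 images of each class, so no class is missing from any fold.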
The accuracy for each fold, from the 1st to the 10th, is visualized below:
cross-val