This was a group project that we completed as part of a course.
This dataset contains 102 classes, each with 40 to 258 images. The images within the dataset show considerable variation. The dataset can be downloaded here.
Code for this section can be found in Final_draft_Cnn_flowers.ipynb.
We tried several approaches with this dataset. A comparison of all the approaches is shown below:
Based on the comparison in Table 1, we chose the small model as our final model and trained it for
longer. We trained it for up to 50 epochs with an early stopping condition: training stops if the
validation accuracy does not improve by at least 0.0001 within 5 epochs. Training stopped after 23 epochs
with a training accuracy of 100% and a training loss of 9.8343e-04; the validation accuracy and loss
were 53% and 3.54, respectively.
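The early stopping rule described above can be sketched in plain Python. This is only an illustration of the rule, not the code from the notebook: the function name `early_stopping_epoch` and the example accuracy values are made up, and we assume an improvement counts only when it exceeds the threshold relative to the best value seen so far. Deep learning frameworks usually ship this as a ready-made callback (for example, Keras has an `EarlyStopping` callback with `min_delta` and `patience` parameters).

```python
def early_stopping_epoch(val_accuracies, min_delta=1e-4, patience=5):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once `patience` consecutive epochs have passed without the
    validation accuracy beating the best value so far by more than `min_delta`.
    """
    best = float("-inf")
    epochs_without_improvement = 0
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc - best > min_delta:
            # Real improvement: remember it and reset the counter.
            best = acc
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None  # the condition never triggered within the given epochs


# Hypothetical validation accuracies (not our real training history).
history = [0.30, 0.40, 0.50, 0.50, 0.50, 0.50, 0.50, 0.50]
stop_epoch = early_stopping_epoch(history)
print(stop_epoch)
```

In this made-up history the best value is reached at epoch 3, so the rule waits five more epochs before stopping.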
The average test accuracy was 47%, and the average over all class-specific accuracies was 44%. The
average precision was 0.49.
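To make the difference between these three numbers concrete, here is a small sketch that derives them from a confusion matrix. It rests on assumptions: we take "class-specific accuracy" to mean the fraction of a class's test images that were classified correctly, and "average precision" to mean the unweighted mean of the per-class precisions. The function name `summarize` and the 3-class matrix are made up for illustration and are not our results.

```python
def summarize(confusion):
    """Summarize a confusion matrix where confusion[i][j] is the number of
    samples with true class i that were predicted as class j.

    Returns (overall accuracy, mean class-specific accuracy, mean precision).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    class_accuracies = []
    class_precisions = []
    for k in range(n):
        true_k = sum(confusion[k])                         # samples that belong to class k
        pred_k = sum(confusion[i][k] for i in range(n))    # samples predicted as class k
        class_accuracies.append(confusion[k][k] / true_k if true_k else 0.0)
        class_precisions.append(confusion[k][k] / pred_k if pred_k else 0.0)

    return (
        correct / total,
        sum(class_accuracies) / n,
        sum(class_precisions) / n,
    )


# Hypothetical, imbalanced 3-class example (10, 4, and 2 test samples).
toy_confusion = [
    [8, 2, 0],
    [1, 3, 0],
    [0, 1, 1],
]
overall, mean_class_acc, mean_precision = summarize(toy_confusion)
print(overall, mean_class_acc, mean_precision)
```

Because the classes differ in size, the overall accuracy is dominated by the large classes, while the mean over class-specific accuracies weights every class equally. That is one plausible reason the two averages differ.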
The confusion matrix and the number of classes per precision value are shown below:
Class-specific accuracy and precision are shown below in their respective graphs.
Code for this section can be found in Accuracy_per_class_cross_val.ipynb.
We tried cross-validation with 10 folds. We used Scikit-learn’s StratifiedKFold() function to make
sure every class is represented in the same proportion in both the training and validation datasets. The
maximum and minimum test accuracies were 50% and 31%, respectively. Across the 10 test results, the
standard deviation was 0.0548 and the average was 47%.
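The stratified split can be illustrated with a minimal sketch. The labels here are a made-up toy set with 3 classes of 40, 20, and 10 samples, and `shuffle=True` with `random_state=0` are assumptions, since the settings used in the notebook are not stated here. The sketch only counts how many samples of each class land in each validation fold.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical toy labels: 3 classes with 40, 20, and 10 samples.
labels = np.array([0] * 40 + [1] * 20 + [2] * 10)
# Placeholder features; the split is driven by the labels only.
features = np.zeros((len(labels), 1))

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

fold_class_counts = []
for train_idx, val_idx in skf.split(features, labels):
    # Count the samples of each class in this validation fold.
    fold_class_counts.append(np.bincount(labels[val_idx], minlength=3))
fold_class_counts = np.array(fold_class_counts)

print(fold_class_counts)
```

Each of the 10 validation folds receives 4, 2, and 1 samples of the three classes, so the 4:2:1 class ratio of the full toy set is preserved in every fold.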
Accuracy from the 1st to the 10th fold is visualized below: