We're going to be building Food Vision Big™, using all of the data from the Food101 dataset. This model will try to recognize the food item from its image Yep. All 75,750 training images and 25,250 testing images.
And guess what...
This time we've got the goal of beating DeepFood, a 2016 paper which used a Convolutional Neural Network trained for 2-3 days to achieve 77.4% top-1 accuracy.
🔑 Note: Top-1 accuracy means "accuracy for the top softmax activation value output by the model" (because softmax ouputs a value for every class, but top-1 means only the highest one is evaluated). Top-5 accuracy means "accuracy for the top 5 softmax activation values output by the model", in other words, did the true label appear in the top 5 activation values? Top-5 accuracy scores are usually noticeably higher than top-1.
🍔👁 Food Vision Big™ | |
---|---|
Dataset source | TensorFlow Datasets |
Train data | 75,750 images |
Test data | 25,250 images |
Mixed precision | Yes |
Data loading | Performanant tf.data API |
Target results | 77.4% top-1 accuracy (beat DeepFood paper) (https://data.vision.ee.ethz.ch/cvl/datasets_extra/food-101/static/bossard_eccv14_food-101.pdf)) |