/DL-potential-dishes

Identify samples that could potentially be considered as a combination of two dishes from their pictures

Primary LanguageJupyter NotebookMIT LicenseMIT

Deep Learning assignment: Identify potential combinations of dishes

Angel Martinez-Tenor

September 2018 (last maintenance: July 2022)

Github repo

Goal

This repository provides a small application in Python that identifies samples that could potentially be considered as a combination of two dishes given their pictures

Description

Methodology: Build, train and validate several custom and pre-trained convolutional networks. Select the best model (highest validation accuracy) and display and save potential combinations of dishes: those misclassified or with output (sigmoid) ∈ (0.45, 0.55).

Input: Two separated folders with pictures of each class. The example provided here uses a dataset with 402 pictures of sandwiches and 402 pictures of sushi

Only the best model obtained is shown here: MobileNet with input size (224,224) pre-trained with Imagenet with a small fully connected classified trained and tuned for the input dataset.

This implementation is largely influenced and reuses code from the following sources:

How to run the code

Requirements

  • Python 3.7+ (conda environment with Python 3.10 suggested)

Instructions

  1. Clone the repository using git:

    git clone https://github.com/angelmtenor/DL-potential-dishes.git
    
  2. Create a virtual/conda environment (optional):

    conda create -n potential-dishes python=3.10
    conda activate potential-dishes
    
  3. In the folder of the cloned repository, install the dependencies (Numpy, Matplotlib, Seaborn, Pillow, TensorFlow, and Keras):

    cd DL-potential-dishes
    pip install -r requirements.txt
    

    To install tensorflow with GPU support, follow the instructions of this guide: Install TensorFlow GPU.

  4. Run the main script:

    python potential_dishes.py
    

    Tested on both, pure Ubuntu 22 with no GPU and Ubuntu 22 with RTX 2080 on WSL (Windows 11), with similar performance and training time (small dataset with training time ~15s)

Optional:

  • Change constants SOURCE_FILE, DATA_FILE, DATA_DIR, CLASSES on top of the main script to use another dataset with different dishes (2 classes only)

  • Open the notebook example with Jupyter Notebook:

    jupyter notebook potential_dishes.ipynb
    

Analysis of results and & Future Work

The best model obtained, based on transfer learning with a pre-trained MobileNet, achieved accuracies between 89-92% on the validation set. Less than 80% of accuracy was obtained with smaller custom convolutional models without transfer learning.

The generator of the augmented images used to train the classifier is based on the fact that the dishes are usually centered and photographed from different angles.

The identified potential dishes contain both actual potential combination and no combination at all. New potential dishes can be obtained by changing the 'SEED' parameter in the main script (different validation set).

Better accuracies of the classifier can be obtained by training with a large dataset or by fine-tuning the top layers of the pre-trained MobileNet network. However, it is likely that the identification of potential dishes does not improve.

Alternate advanced methods could include Style Transfer or using Generative Adversarial Networks for combining data, as RemixNet.