FoodBERT: Exploiting Food Embeddings for Ingredient Substitution

Official repository of the paper "Exploiting Food Embeddings for Ingredient Substitution" (Published at the International Conference on Health Informatics 2021).

Identifying fitting substitutes for cooking ingredients can be beneficial for various goals, such as nutrient optimization, avoiding allergens, or adapting a recipe to personal preferences. In this repository, we present two models for ingredient embeddings, Food2Vec and FoodBERT. Additionally, we combine both approaches with images, resulting in two multimodal representation models. FoodBERT is furthermore used for relation extraction. According to a ground-truth-based evaluation and a human evaluation, FoodBERT, and especially its multimodal version, is best suited for substitute recommendations in dietary use cases.

Installation:

  1. Clone this repository

    git clone https://github.com/ChantalMP/Exploiting-Food-Embeddings-for-Ingredient-Substitution
    
  2. Install requirements:

    • Python 3.7
    pip install -r requirements.txt
    python -m spacy download en_core_web_lg
    
  3. Download data and models

  4. Optional: Generate data for FoodBERT and relation extraction (RE) training

    python -m normalisation.normalize_recipe_instructions
    python -m foodbert.preprocess_instructions
    

    Only for RE training:

    • Unfortunately, we cannot publish the comment data needed for the relation extraction model.
    • If you want to train or use the relation extraction model to generate substitutes, you need to scrape the comments yourself. The scripts for this are provided as is and are not maintained.
    • All scraping scripts can be found in comment_scraping.
  5. Evaluation:

    • We cannot make our ground truth public, but it is available upon request if you want to reproduce our results or compare your own method.
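
After completing the installation steps, a quick sanity check of the environment can save time before running any training or evaluation script. The following is a minimal sketch, not part of this repository: the helper names `missing_modules` and `python_matches` are hypothetical, and the check only confirms that the Python version and the spaCy packages named in step 2 are importable. It does not verify the downloaded data or models.

```python
import importlib.util
import sys


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


def python_matches(major, minor):
    """Return True if the running interpreter has exactly this major.minor version."""
    return sys.version_info[:2] == (major, minor)


if __name__ == "__main__":
    # Python 3.7 is the version listed in the installation steps.
    if not python_matches(3, 7):
        print("Warning: running Python %d.%d, but 3.7 is listed." % sys.version_info[:2])

    # "spacy" comes from requirements.txt (assumption based on step 2);
    # "en_core_web_lg" is installed by `python -m spacy download en_core_web_lg`.
    missing = missing_modules(["spacy", "en_core_web_lg"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("spaCy and en_core_web_lg are importable.")
```

If the script reports missing modules, rerunning the commands from step 2 is the first thing to try.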

Usage

  1. Human and ground-truth-based evaluation: see evaluation/README.md
  2. Food2Vec training and substitute generation: see food2vec/README.md
  3. FoodBERT training: see foodbert/README.md
  4. FoodBERT substitute generation: see foodbert_embeddings/README.md
  5. Generating image embeddings for multimodal approaches: see multimodal/README.md
  6. Data normalisation: see normalisation/README.md
  7. Relation Extraction training and substitute generation: see relation_extraction/README.md
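
To give an intuition for what embedding-based substitute generation means, the sketch below ranks candidate ingredients by cosine similarity to a query ingredient. This is an illustrative assumption, not the procedure implemented in this repository: the function names `cosine_similarity` and `rank_substitutes` are hypothetical, and the three-dimensional vectors are invented toy values, not outputs of Food2Vec or FoodBERT. The actual pipelines are described in the READMEs listed above.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_substitutes(query, embeddings, top_k=3):
    """Return the top_k ingredient names most similar to `query`, excluding the query itself."""
    query_vec = embeddings[query]
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in embeddings.items()
        if name != query
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scored[:top_k]]


# Toy vectors made up for illustration only; real embeddings have far more dimensions.
toy_embeddings = {
    "butter": [0.9, 0.1, 0.0],
    "margarine": [0.8, 0.2, 0.1],
    "olive_oil": [0.6, 0.1, 0.6],
    "flour": [0.0, 1.0, 0.1],
}

if __name__ == "__main__":
    print(rank_substitutes("butter", toy_embeddings, top_k=2))
```

With these toy values, the ingredients whose vectors point in a direction similar to that of "butter" are ranked first. Nearest-neighbour ranking alone does not guarantee that a candidate is a usable substitute, which is one reason the paper evaluates candidates against a ground truth and with human raters.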

Colab Examples

Using FoodBERT: Open In Colab
Using Food2Vec: Open In Colab
Using Image Embeddings: Open In Colab
Generate Substitutes - FoodBERT: Open In Colab
Generate Substitutes - Food2Vec: Open In Colab

If you encounter any problems with the code, feel free to contact us at {chantal.pellegrini, ege.oezsoy, monika.wintergerst}[at]tum[dot]de.