/what-s-cooking

The present work utilizes state-of-the-art text mining and machine learning algorithms to predict the category of cuisine (e.g. Mexican, Italian, Indian etc.) for a given recipe, whose ingredients are input to the model. The model is trained on a large dataset that contains the list of ingredients for several recipes belonging to different cuisine categories, and performs the cuisine category classification based on this learning. 1.1. Objectives & Need The present work aims to enable automated categorizing of the cuisine for any recipe. If the user desires to know the origin of a recipe, or wants to know which section of the menu or website the recipe at hand should be put into, then our model shall enable an objective classification based on the input of ingredients for the recipe. This shall enable restaurants, bloggers, food delivery chains, nutritionists etc. in designing menus, recipe lists and nutrition charts by correct segregation of cuisine categories, enabling customers to make better choices in terms of taste and diet preferences. The present means of classification of recipes is very subjective, since it is always done manually. Globalization has led to fusion of cultures, including food preferences and tastes, and many recipes that were earlier associated with a certain culture, have been modified so extensively to suit the tastes of another culture, and have become so mainstream in that culture, that they should no longer be considered to be belonging to the original culture. An algorithmic approach will therefore do a better job in classifying recipes. We used a variety of techniques on the same dataset and have been able to understand the difference in behaviour of different tools when they are applied on the same natured data set. This enabled us to explore our limits on different models and how they will behave on a certain data. We were able to get different models and satisfactory results through multiple iterations. We also faced issues, in terms of the data set size, as in KNN and SVM did not gave prominent results and had to struggle due to the larger data set.

Primary LanguageR

The present work utilizes state-of-the-art text mining and machine learning algorithms to predict the category of cuisine (e.g. Mexican, Italian, Indian etc.) for a given recipe, whose ingredients are input to the model. The model is trained on a large dataset that contains the list of ingredients for several recipes belonging to different cuisine categories, and performs the cuisine category classification based on this learning.

Objectives & Need

The present work aims to enable automated categorizing of the cuisine for any recipe. If the user desires to know the origin of a recipe, or wants to know which section of the menu or website the recipe at hand should be put into, then our model shall enable an objective classification based on the input of ingredients for the recipe. This shall enable restaurants, bloggers, food delivery chains, nutritionists etc. in designing menus, recipe lists and nutrition charts by correct segregation of cuisine categories, enabling customers to make better choices in terms of taste and diet preferences.

The present means of classification of recipes is very subjective, since it is always done manually. Globalization has led to fusion of cultures, including food preferences and tastes, and many recipes that were earlier associated with a certain culture, have been modified so extensively to suit the tastes of another culture, and have become so mainstream in that culture, that they should no longer be considered to be belonging to the original culture. An algorithmic approach will therefore do a better job in classifying recipes. We used a variety of techniques on the same dataset and have been able to understand the difference in behaviour of different tools when they are applied on the same natured data set. This enabled us to explore our limits on different models and how they will behave on a certain data. We were able to get different models and satisfactory results through multiple iterations. We also faced issues, in terms of the data set size, as in KNN and SVM did not gave prominent results and had to struggle due to the larger data set.