This repository hosts an unsupervised model for Document Clustering of food recipes.

Food_Recipes_Document_Clustering

This repo hosts an unsupervised machine learning model for document clustering on Kaggle's Food.com recipes corpus. The corpus contains 180K+ recipes and 700K+ recipe reviews; this analysis focuses only on the recipes and ignores the reviews.

Note: The data set is not included in this repo due to size limitations. Make sure to download the data (RAW_recipes.csv) from Kaggle and put it in the project root.
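As a quick sanity check that the file is in place, the sketch below reads the recipe names using only the Python standard library. It is not taken from Analysis.ipynb: the helper names `read_names` and `load_recipe_names` are hypothetical, and the column name `name` is an assumption about the CSV header that should be checked against the downloaded file.

```python
import csv
from pathlib import Path


def read_names(lines, column="name"):
    """Parse CSV text lines and return the non-blank values of one column.

    `column` defaults to "name", which is an assumed header in RAW_recipes.csv.
    """
    reader = csv.DictReader(lines)
    names = []
    for row in reader:
        value = (row.get(column) or "").strip()
        if value:  # skip rows where the name is missing or blank
            names.append(value)
    return names


def load_recipe_names(path="RAW_recipes.csv", column="name"):
    """Load recipe names from the CSV file expected in the project root."""
    csv_path = Path(path)
    if not csv_path.is_file():
        raise FileNotFoundError(
            f"{csv_path} not found; download RAW_recipes.csv from Kaggle "
            "and place it in the project root."
        )
    with csv_path.open(newline="", encoding="utf-8") as handle:
        return read_names(handle, column=column)
```

Calling `load_recipe_names()` from the project root returns a list of strings, or raises `FileNotFoundError` with a reminder if the download step was skipped.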

The goal is to extract main categories or groupings of recipes based on their names.
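To make the idea of grouping recipes by name concrete, here is a minimal standard-library sketch that turns names into TF-IDF vectors and groups them with a simple spherical k-means. This README does not state which algorithm the notebook uses, so TF-IDF plus k-means is an assumption chosen for illustration. All function names are hypothetical, and the first `k` names are used as seeds purely to keep the example deterministic.

```python
import math
import re
from collections import Counter


def tokenize(name):
    """Lowercase a recipe name and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", name.lower())


def tfidf_vectors(names):
    """Return one unit-length sparse TF-IDF vector (term -> weight) per name."""
    docs = [Counter(tokenize(n)) for n in names]
    doc_freq = Counter(term for doc in docs for term in doc)
    n_docs = len(docs)
    vectors = []
    for doc in docs:
        vec = {
            term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, tf in doc.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({term: w / norm for term, w in vec.items()})
    return vectors


def dot(a, b):
    """Sparse dot product; equals cosine similarity for unit-length vectors."""
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def kmeans(vectors, k, n_iter=20):
    """Assign each vector to the most similar of k centroids.

    Seeds are the first k vectors (an illustrative, deterministic choice).
    """
    k = min(k, len(vectors))
    centroids = [dict(v) for v in vectors[:k]]
    labels = None
    for _ in range(n_iter):
        new_labels = [
            max(range(k), key=lambda c: dot(v, centroids[c])) for v in vectors
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            total = {}
            for v, label in zip(vectors, labels):
                if label == c:
                    for term, w in v.items():
                        total[term] = total.get(term, 0.0) + w
            if not total:  # empty cluster: keep the previous centroid
                continue
            norm = math.sqrt(sum(w * w for w in total.values()))
            centroids[c] = {term: w / norm for term, w in total.items()}
    return labels if labels is not None else []


names = ["chocolate chip cookies", "chicken noodle soup",
         "chocolate cookies", "chicken soup"]
labels = kmeans(tfidf_vectors(names), k=2)
```

In this toy input the two cookie names end up in one cluster and the two soup names in the other. On the full corpus a library implementation with better seeding would be the more practical choice.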

You can also check the Kaggle Kernel for this notebook here.

Dependencies

To reproduce the report (Analysis.ipynb) locally, run the commands below in a terminal to create and activate the environment for the notebook.

conda env create -f environment.yaml
conda activate env