This GitHub repository contains the IPython notebooks for the analysis of more than 7,000 recipe articles, published between 2009 and 2019 and web-scraped from The Guardian website. You can view the most recently published recipes over here.
The analysis of this large recipe dataset is summarised in two blog posts published in Towards Data Science on Medium.
- Part 1: Exploratory Data Analysis gives an overview of the most popular chefs and recipe categories for each year.
- Part 2: Topic Modeling explores Natural Language Processing (NLP), the use of machine learning algorithms to analyse and retrieve information from textual data.
Hope you enjoy reading the articles and diving into the code to learn more about EDA and NLP techniques.
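
As a taste of the kind of question Part 1 asks (who published the most recipes in each year), here is a minimal sketch using only the Python standard library. The records, the field names (`year`, `author`, `title`) and the function `top_authors_by_year` are illustrative assumptions and are not taken from the notebooks or the scraped dataset.

```python
from collections import Counter, defaultdict

# Placeholder records: the field names and values are assumptions
# for illustration, not the schema or content of the real dataset.
recipes = [
    {"year": 2018, "author": "Chef A", "title": "Recipe 1"},
    {"year": 2018, "author": "Chef A", "title": "Recipe 2"},
    {"year": 2018, "author": "Chef B", "title": "Recipe 3"},
    {"year": 2019, "author": "Chef B", "title": "Recipe 4"},
    {"year": 2019, "author": "Chef B", "title": "Recipe 5"},
    {"year": 2019, "author": "Chef A", "title": "Recipe 6"},
]


def top_authors_by_year(records, n=1):
    """Return the n most frequent authors for each year, ordered by year."""
    counts = defaultdict(Counter)
    for record in records:
        # One counter per year, incremented once per recipe by that author.
        counts[record["year"]][record["author"]] += 1
    return {year: counter.most_common(n) for year, counter in sorted(counts.items())}


if __name__ == "__main__":
    for year, authors in top_authors_by_year(recipes).items():
        print(year, authors)
```

The same grouping could be written with a dataframe library; the standard library version is used here only to keep the sketch self-contained.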