Yelp Dataset Challenge: An Exploration Of Steak Reviews Using NLP

Anahita Bahri

Check out my slides here!


I have many obsessions, but there’s one thing that everyone associates me with, whether they’re a close friend or an acquaintance who happens to follow me on Instagram: l’entrecôte, or steak frites!

As a superfan of steak frites, I’m always on the lookout for my new favorite steak spot. There are many things I take into consideration, like...

  • Do I get fries?
  • Is there sauce involved?
  • How about wine?
  • And, most importantly, how tasty is the steak?

How about those who review steakhouses on Yelp? What factors may influence a user when rating a restaurant? Why would a user give a restaurant 3 stars over 5? Does the service matter? Quality of the food? Perhaps the price?

Throughout this project, I try to uncover trends in the steak realm using several techniques: finding the most frequent words per rating (1-5 stars), modeling topics in the review text (overall and per rating), and extracting words similar to a few of the most frequent words in the reviews, among others.
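To make the first of these techniques concrete, here is a minimal sketch of counting the most frequent words per star rating. It is only an illustration, not the code used in the project: the sample reviews, the tiny stop-word list, and the function name `top_words_by_rating` are all made up for this example, and it relies solely on Python's standard library.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stand-in for the Yelp data: (star rating, review text) pairs.
SAMPLE_REVIEWS = [
    (5, "The steak was perfect and the fries were crispy."),
    (5, "Perfect steak, great wine, friendly service."),
    (1, "Cold steak and slow service. The steak was also overpriced."),
]

# A tiny illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "and", "was", "were", "a", "also"}


def top_words_by_rating(reviews, n=3):
    """Return the n most common non-stop-words for each star rating."""
    counts = defaultdict(Counter)
    for stars, text in reviews:
        # Lowercase the text and keep runs of letters/apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts[stars].update(t for t in tokens if t not in STOP_WORDS)
    return {stars: counter.most_common(n) for stars, counter in counts.items()}


result = top_words_by_rating(SAMPLE_REVIEWS)
for stars in sorted(result):
    print(stars, result[stars])
```

On real reviews, the same idea would be applied to the full review text grouped by rating, most likely with a proper tokenizer and stop-word list in place of the simplified ones assumed here.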

As someone relatively new to data science, I haven’t been able to answer all of these questions just yet. I have, however, most certainly begun tackling them.