
The Movie Database (TMDB) is a well-known, community-built database of movies and television shows that dates back to 2008. The TMDB 5000 Movie Dataset is a subset of it: two datasets that together hold metadata on approximately 5,000 movies, with twenty-two variables covering each film’s plot, cast, crew, budget, and revenue. I set out to analyze which factors contributed to a movie’s overall score, and how that score influenced other variables in the dataset, in order to build a baseline movie recommendation system. Here is my notebook for this analysis.

