For as long as I can remember, IMDB has been my go-to place for anything and everything about movies. I owe my weird taste in movies to IMDB, and without the site I would have missed out on some truly beautiful gems.
When I saw a dataset of IMDB's top 5000 movies on Kaggle, I knew I had to mine it for insights.
The dataset comprises over 5000 movies, not just from Hollywood but from around the world. It has financial information about the movies, the cast and directors, and the corresponding IMDB rank. Do note that the dataset is in no way comprehensive. Nonetheless, it is big enough to pique my interest. If you would like to know more, check it out here.
These are the questions I set out to answer:
- What are the most frequently used plot keywords, movie title words and genres?
- What is the trend in the gross revenue and budget of the movies, in nominal as well as inflation-adjusted terms, over 100 years?
- Who are the top actors and directors, and which are the top movies, for each of the past 10 decades?
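A quick aside on what "inflation adjusted" means in the second question: a nominal amount from some year is rescaled by the ratio of a price index in a chosen base year to the index in the movie's release year. Here is a minimal sketch of that arithmetic in Python (the notebook itself is in R). The function name, the index values and the movie figures are all made up for illustration; they are not real CPI numbers or rows from the dataset.

```python
def adjust_for_inflation(nominal, index_then, index_base):
    """Express a nominal amount in base-year money using a price index."""
    return nominal * index_base / index_then


# Hypothetical price-index values, picked only to keep the arithmetic easy.
price_index = {1980: 50.0, 2000: 100.0, 2016: 150.0}
base_year = 2016

# Hypothetical (release year, nominal gross) pairs, not taken from the dataset.
movies = [(1980, 10_000_000), (2000, 10_000_000), (2016, 10_000_000)]

adjusted = [
    (year, adjust_for_inflation(gross, price_index[year], price_index[base_year]))
    for year, gross in movies
]

for year, real_gross in adjusted:
    print(year, f"{real_gross:,.0f}")
```

With these placeholder numbers, the same nominal gross is worth three times as much in base-year terms when it was earned in 1980, which is why nominal and adjusted trends can look so different over a century.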
To dive into the data exploration and code, please check out my R Notebook.