Box-Office-Prediction

This repository contains the dataset used to train an ML model that predicts a movie's box office success (revenue) from features such as Genre, Budget, Popularity, and Star Power.

The data was taken from various TMDB datasets found on Kaggle and then run through the TMDB and OMDB APIs to fetch data for all features. Some of the movies were also extracted from TMDB directly through its API.
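As a rough illustration of the OMDB enrichment step, the sketch below turns one OMDB-style JSON response into typed values. It is not the script used to build this dataset. The function name `parse_omdb_fields`, the output keys, and the sample values are hypothetical; the input keys `imdbRating`, `imdbVotes`, and `Rated` follow the field names OMDB returns, where missing values appear as the string "N/A".

```python
def parse_omdb_fields(payload):
    """Extract IMDb rating, IMDb vote count and MPAA rating from an OMDB-style dict.

    Hypothetical helper: output key names are illustrative, not the dataset's columns.
    """
    def clean(value):
        # OMDB marks missing values with the literal string "N/A".
        return None if value in (None, "", "N/A") else value

    rating = clean(payload.get("imdbRating"))
    votes = clean(payload.get("imdbVotes"))
    return {
        "imdb_rating": float(rating) if rating is not None else None,
        # Vote counts arrive as strings with thousands separators, e.g. "12,345".
        "imdb_vote_count": int(votes.replace(",", "")) if votes is not None else None,
        "mpaa_rating": clean(payload.get("Rated")),
    }


# Made-up sample payload, for illustration only.
sample = {"imdbRating": "7.5", "imdbVotes": "12,345", "Rated": "PG-13"}
print(parse_omdb_fields(sample))
```

Converting "N/A" to `None` at this stage keeps missing ratings distinguishable from real zeros when the data is later used for training.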

The dataset has 6065 movies in total, with the following features:

Genres
ID
Original Language
Original Title
Overview
Popularity Rating
Release Date
Title
TMDB Rating
TMDB Vote Count
IMDb ID
Budget
Revenue
Production Companies
Cast
Crew
Production Countries
Spoken Languages
Runtime
Tagline
MPAA Rating
IMDb Rating
IMDb Vote Count
Star Power

Star Power is the popularity index of the most popular actor in the movie. The IMDb rating, IMDb vote count, and MPAA rating were taken from the OMDB API.
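The Star Power definition can be sketched in a few lines. This is a minimal example, assuming each cast entry is a dict with a `popularity` value (as in TMDB credits data); the function name `star_power` and the sample cast are hypothetical and not taken from this repository.

```python
def star_power(cast):
    """Return the highest popularity value among the cast, or None if unavailable.

    Assumes `cast` is a list of dicts that may carry a numeric "popularity" key.
    """
    popularities = [
        member["popularity"]
        for member in cast
        if member.get("popularity") is not None
    ]
    return max(popularities) if popularities else None


# Made-up cast list, for illustration only.
example_cast = [
    {"name": "Actor A", "popularity": 12.4},
    {"name": "Actor B", "popularity": 31.7},
    {"name": "Actor C"},  # entry with no popularity value
]
print(star_power(example_cast))
```

Returning `None` for an empty or unscored cast, rather than 0, avoids treating movies with missing cast data as having no star power.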
