movie-stats

Analyzing three decades of movie data.

Context

Is the movie industry dying? Is Netflix the new entertainment king? Those were the first questions that led me to create a dataset focused on movie revenue and analyze it over the last few decades. But why stop there? Other factors come into play too, such as actors, genres, and user ratings. Now anyone with some experience (you) can ask specific questions about the movie industry and get answers.

Content

There are 6820 movies in the dataset (220 movies per year from 1986 through 2016, 31 years in total). Each movie has the following attributes:

  • budget: the budget of the movie. When the budget is unknown, it is recorded as 0

  • company: the production company

  • country: country of origin

  • director: the director

  • genre: main genre of the movie

  • gross: revenue of the movie

  • name: name of the movie

  • rating: rating of the movie (R, PG, etc.)

  • released: release date (YYYY-MM-DD)

  • runtime: duration of the movie

  • score: IMDb user rating

  • votes: number of user votes

  • star: main actor/actress

  • writer: writer of the movie

  • year: year of release
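Because a budget of 0 means "unknown" rather than "free", it helps to convert it to a missing value before computing anything. The sketch below is one possible way to do that with the standard library, and it also totals gross revenue per year. The sample rows are invented placeholders (not real dataset values), only four of the columns are shown, and the helper names `load_movies` and `gross_by_year` are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Invented placeholder rows, NOT real dataset values; only four columns shown.
SAMPLE_CSV = """name,year,budget,gross
Movie A,1986,1000000,5000000
Movie B,1986,0,2000000
Movie C,1987,3000000,4000000
"""


def load_movies(handle):
    """Read rows from a CSV file handle, mapping a budget of 0 to None."""
    movies = []
    for row in csv.DictReader(handle):
        budget = float(row["budget"])
        movies.append({
            "name": row["name"],
            "year": int(row["year"]),
            # 0 marks a missing budget in this dataset, so store None instead.
            "budget": budget if budget > 0 else None,
            "gross": float(row["gross"]),
        })
    return movies


def gross_by_year(movies):
    """Sum the gross revenue of all movies for each release year."""
    totals = defaultdict(float)
    for movie in movies:
        totals[movie["year"]] += movie["gross"]
    return dict(totals)


movies = load_movies(io.StringIO(SAMPLE_CSV))
known_budget = [m for m in movies if m["budget"] is not None]

print(gross_by_year(movies))
print(len(known_budget), "of", len(movies), "rows have a known budget")
```

To try this on the real data, pass an open file handle for the dataset's CSV to `load_movies` instead of the in-memory sample. The file name and encoding depend on how you downloaded it, so they are left out here.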

Acknowledgements

This data was scraped from IMDb.

Contribute

You can contribute via GitHub and Kaggle.