/streaming_movies

Analysis of a large dataset of movies on Disney+, Amazon Prime, Netflix and Hulu. Filter and visualise, for example, the most common genres, runtimes, languages and directors.


Streaming services: Movies

The hardest decision of the evening: which movie do we want to watch tonight?

There are a lot of streaming providers out there, but the two biggest are arguably Prime Video and Netflix.

The dataset I used contains data from four streaming providers: Prime Video, Netflix, Hulu and Disney+.

I could of course also add Sky, Apple or others, but I want to focus on the biggest global providers.

source: Netflix

In this project I want to get an overview of:

  • the number of movies on each of the 4 streaming providers
  • the most common genres
  • the runtimes -> also important when deciding on tonight's movie :-)
  • the most common languages
  • the most common directors
  • the most common genres among German-language movies
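The questions above mostly come down to counting values per column. The following is a minimal sketch of that idea using only the Python standard library, not the code from the notebook. The column names (`Netflix`, `Hulu`, `Prime Video`, `Disney+`, `Genres`, `Language`) and the three sample rows are assumptions made for illustration; the real CSV may be laid out differently.

```python
import csv
import io
from collections import Counter

# Tiny stand-in for the real CSV. Column names and rows are assumptions.
SAMPLE_CSV = """Title,Netflix,Hulu,Prime Video,Disney+,Genres,Language,Runtime
Movie A,1,0,1,0,"Drama,Romance",English,110
Movie B,0,1,0,0,Comedy,"English,German",95
Movie C,1,0,0,1,"Comedy,Drama",German,102
"""

SERVICES = ["Netflix", "Hulu", "Prime Video", "Disney+"]


def load_rows(text):
    """Parse CSV text into a list of dicts, one per movie."""
    return list(csv.DictReader(io.StringIO(text)))


def split_values(cell):
    """Split a comma-separated cell such as 'Drama,Romance' into clean values."""
    return [value.strip() for value in cell.split(",") if value.strip()]


def movies_per_service(rows):
    """Count how many movies carry a 1 flag for each service."""
    return {service: sum(int(row[service]) for row in rows) for service in SERVICES}


def most_common(rows, column, n=3):
    """Return the n most frequent values in a comma-separated column."""
    counter = Counter()
    for row in rows:
        counter.update(split_values(row[column]))
    return counter.most_common(n)


def rows_with_language(rows, language):
    """Keep only the movies whose Language cell lists the given language."""
    return [row for row in rows if language in split_values(row["Language"])]


rows = load_rows(SAMPLE_CSV)
print(movies_per_service(rows))
print(most_common(rows, "Genres"))
print(most_common(rows_with_language(rows, "German"), "Genres"))
```

The same `most_common` helper could be pointed at a languages or directors column, provided those cells are also comma-separated.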

SOME INSIGHTS

Genres of movies whose titles contain "Love", the most common title word
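One way such a result could be derived is to count the words across all titles, pick the most frequent one, and then tally the genres of the movies whose titles contain it. This is only a sketch with standard-library tools; the stop-word list, the `Title` and `Genres` field names and the sample movies are all hypothetical, and the notebook itself may rely on the `wordcloud` package instead.

```python
import re
from collections import Counter

# Hypothetical stop words; a real analysis would use a fuller list.
STOPWORDS = {"the", "a", "of", "and", "in", "to"}

# Hypothetical sample movies standing in for the real dataset.
movies = [
    {"Title": "Love in the City", "Genres": "Romance,Comedy"},
    {"Title": "The Love Letter", "Genres": "Romance,Drama"},
    {"Title": "Space Race", "Genres": "Sci-Fi"},
]


def title_tokens(title):
    """Lower-case a title and split it into alphabetic words."""
    return re.findall(r"[a-z']+", title.lower())


def top_title_word(movies):
    """Return the most frequent non-stop-word across all titles."""
    words = Counter()
    for movie in movies:
        words.update(t for t in title_tokens(movie["Title"]) if t not in STOPWORDS)
    return words.most_common(1)[0][0]


def genres_for_title_word(movies, word):
    """Count genres of the movies whose title contains the given word."""
    genres = Counter()
    for movie in movies:
        if word in title_tokens(movie["Title"]):
            genres.update(g.strip() for g in movie["Genres"].split(","))
    return genres


word = top_title_word(movies)
print(word, genres_for_title_word(movies, word))
```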

Number of movies on 1 or more streaming services in the dataset
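A count like this can be obtained by summing the per-service flags of each movie and building a histogram of the sums. The sketch below assumes each service is stored as a 0/1 flag under the names shown; both the field names and the sample movies are illustrative assumptions, not taken from the dataset.

```python
from collections import Counter

SERVICES = ["Netflix", "Hulu", "Prime Video", "Disney+"]

# Hypothetical sample movies with 0/1 availability flags.
movies = [
    {"Title": "Movie A", "Netflix": 1, "Hulu": 0, "Prime Video": 1, "Disney+": 0},
    {"Title": "Movie B", "Netflix": 0, "Hulu": 1, "Prime Video": 0, "Disney+": 0},
    {"Title": "Movie C", "Netflix": 1, "Hulu": 0, "Prime Video": 0, "Disney+": 1},
    {"Title": "Movie D", "Netflix": 0, "Hulu": 0, "Prime Video": 1, "Disney+": 0},
    {"Title": "Movie E", "Netflix": 1, "Hulu": 1, "Prime Video": 1, "Disney+": 0},
]


def availability_histogram(movies):
    """Map 'number of services' to 'number of movies on that many services'."""
    return Counter(sum(movie[service] for service in SERVICES) for movie in movies)


histogram = availability_histogram(movies)
for service_count in sorted(histogram):
    print(f"{service_count} service(s): {histogram[service_count]} movie(s)")
```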

Most common languages of the movies in the dataset


Notebook: ipynb file

Dataset:

  • This dataset is from Kaggle.
  • The newly created CSV files are here.

Requirements

pyenv with Python: 3.9.4

wordcloud


Environment

pyenv local 3.9.4

python -m venv .venv

source .venv/bin/activate

pip install --upgrade pip

pip install -r requirements.txt


brew update

brew install node


pip install jupyterlab "ipywidgets>=7.5"

jupyter labextension install jupyterlab-plotly@4.14.3

jupyter labextension install @jupyter-widgets/jupyterlab-manager plotlywidget@4.14.3