`tvseries` R data package

The tvseries package contains data from IMDb regarding TV shows ratings, genre, seasons, chapters and other information.

Installation

# First install R package devtools if it is not installed in your machine
# install.packages("devtools")

devtools::install_github("mireia-bioinfo/tvseries")

Usage

I recommend not loading the whole package with library(tvseries) as it contains large datasets and it might freeze your R session.

You can load and assign the different datasets in the package using:

df <- tvseries::tvseries_top100

Datasets

At this point, the package contains three different datasets:

tvseries_must_watch. Dataset containing IMDb ratings for 4 TV series that I personally recommend you should watch! This TV series are: c("Jane the Virgin", "Crazy Ex-Girlfriend", "Brooklyn Nine-Nine", "The Good Place").
tvseries_top100. Dataset containing IMDb ratings for the top 100 TV series selected by their popularity (average number of votes per episode) and their average ranking.
tvseries_top250. Dataset containing IMDb ratings for the top 250 TV series selected by their popularity (average number of votes per episode) and their average ranking.
tvseries. Dataset containing IMDb ratings for all series, seasons and episodes. Be careful when loading this dataset as it is pretty big and might freeze your session.

All these datasets have the same number of columns which are the following:

Name	Description
tvseries_title	Name of the TV series.
episode_title	Name of the episode.
season_number	Season number.
episode_number	Episode number.
start_year	Year the show started.
end_year	Year the show ended.
runtime_min	Length of the episode in minutes.
genres	Genres of the TV series.
average_rating	Average rating of the episode.
votes	Number of votes for that episode.

Source data was obtained fromm IMDb datasets (a description of all the files and columns can be found here

For more information on how data was processed see data_prep.R.

mireia-bioinfo/tvseries

tvseries R data package

Installation

Usage

Datasets

`tvseries` R data package