The imdb package helps you download information about series and movies from IMDb. It has three functions: one for basic information about a series, a second that also downloads the synopsis, actors, and so on, and a third that downloads information about movies.
For now, you have to install it from GitHub using devtools::install_github("rmhogervorst/imdb")
imdb has 3 functions:
- imdbSeries()
- enrichIMDB()
- imdbMovies()
With the function 'imdbSeries(seriesname = "name of series", seasons = number(s))' you can call up general information about a series. Note that the API does not care about case: "Game of Thrones", "game of thrones", and "gAmE oF tHrONes " are all fine.
library(imdb)
imdbSeries("game of thrones ")
#> Title Released Episode imdbRating
#> 1 Winter Is Coming 2011-04-17 1 9.0
#> 2 The Kingsroad 2011-04-24 2 8.8
#> 3 Lord Snow 2011-05-01 3 8.7
#> 4 Cripples, Bastards, and Broken Things 2011-05-08 4 8.7
#> 5 The Wolf and the Lion 2011-05-15 5 9.1
#> 6 A Golden Crown 2011-05-22 6 9.2
#> 7 You Win or You Die 2011-05-29 7 9.2
#> 8 The Pointy End 2011-06-05 8 9.0
#> 9 Baelor 2011-06-12 9 9.6
#> 10 Fire and Blood 2011-06-19 10 9.4
#> imdbID Season
#> 1 tt1480055 1
#> 2 tt1668746 1
#> 3 tt1829962 1
#> 4 tt1829963 1
#> 5 tt1829964 1
#> 6 tt1837862 1
#> 7 tt1837863 1
#> 8 tt1837864 1
#> 9 tt1851398 1
#> 10 tt1851397 1
The command returns a data frame with the title, release date, episode number, IMDb rating, IMDb ID, and season of each episode.
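Because the result is an ordinary data frame, the usual base R tools apply to it. The following is a minimal sketch that does not call the package: `episodes` is a hypothetical stand-in built by hand from three rows of the output above, and the column types (for example `Released` as a Date) are an assumption rather than something the package guarantees.

```r
# Hypothetical stand-in for the data frame returned by imdbSeries().
# Column names follow the output shown above; the column types are assumed.
episodes <- data.frame(
  Title      = c("Winter Is Coming", "The Kingsroad", "Lord Snow"),
  Released   = as.Date(c("2011-04-17", "2011-04-24", "2011-05-01")),
  Episode    = 1:3,
  imdbRating = c(9.0, 8.8, 8.7),
  imdbID     = c("tt1480055", "tt1668746", "tt1829962"),
  Season     = 1,
  stringsAsFactors = FALSE
)

# Highest-rated episode in this excerpt
best <- episodes[which.max(episodes$imdbRating), c("Title", "imdbRating")]
print(best)

# Mean rating per season (a single season in this excerpt)
season_means <- aggregate(imdbRating ~ Season, data = episodes, FUN = mean)
print(season_means)
```

With a real result from imdbSeries() covering several seasons, the same `aggregate()` call would give one row per season, provided `imdbRating` is numeric.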
Would you like to know more about your series? Use the enrichIMDB command:
season2GOT <- imdbSeries("game of thrones", seasons = 2)
season2GOT_enriched <- enrichIMDB(season2GOT)
The enrichIMDB command returns a separate data frame with imdbID, runtime, director, writer, actors, plot (complete synopsis), and votes per episode. It uses the imdbID of each episode to look up more information. So if you'd like to know which episode synopses mention Jon Snow, or in how many episodes of season 2 Peter Dinklage appears, you can now search for it.
grep("Jon", season2GOT_enriched$plot)
#> [1] 2 3 6 7 8 10
grep("Peter Dinklage", season2GOT_enriched$actors)
#> [1] 1 2 3 4 5 6 7 8 9 10
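Note that grep() returns the row numbers of the matching episodes, not the number of mentions. The sketch below shows the difference on a small hand-made data frame; `enriched`, its IDs, and its plot texts are invented placeholders with the same column names as the enrichIMDB output, not real data from the package.

```r
# Invented placeholder data with the column names described above
enriched <- data.frame(
  imdbID = c("tt0000001", "tt0000002", "tt0000003"),
  plot   = c("Jon rides north. Jon meets the Night's Watch.",
             "Tyrion arrives in King's Landing.",
             "Jon and Sam take their vows."),
  actors = c("Peter Dinklage, Kit Harington",
             "Peter Dinklage, Lena Headey",
             "Kit Harington, John Bradley"),
  stringsAsFactors = FALSE
)

# Row numbers of the episodes whose plot mentions "Jon"
episode_rows <- grep("Jon", enriched$plot, fixed = TRUE)

# Number of episodes that list a given actor
n_episodes <- sum(grepl("Peter Dinklage", enriched$actors, fixed = TRUE))

# Number of mentions of "Jon" within each plot;
# gregexpr() returns -1 for "no match", so count only positive positions
matches  <- gregexpr("Jon", enriched$plot, fixed = TRUE)
mentions <- vapply(matches, function(m) sum(m > 0), integer(1))

print(episode_rows)  # 1 3
print(n_episodes)    # 2
print(mentions)      # 2 0 1
```

`fixed = TRUE` treats the search term as a literal string rather than a regular expression, which is usually what you want for names.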
Combining the two data frames is also useful, for example to plot the rating per episode with the point size scaled by the number of votes.
library(ggplot2)
suppressPackageStartupMessages(library(dplyr))
# Episodes of seasons 1 to 6, joined with the per-episode details on imdbID
GOTall <- imdbSeries("game of Thrones", 1:6)
GOT <- left_join(GOTall, enrichIMDB(GOTall), by = "imdbID")
# One smoothed line per season; point size reflects the number of votes
ggplot(GOT, aes(Episode, imdbRating)) +
  geom_smooth(aes(color = as.factor(Season)), se = FALSE, alpha = 1/10) +
  geom_point(aes(color = as.factor(Season), size = votes)) +
  ggtitle("Rating per episode of GoT, \ncolored by season\nwith smoothlines")
#> `geom_smooth()` using method = 'loess'
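A left join keeps every row of the first data frame and fills in NA where the second has no match. The sketch below illustrates this with base R's merge(), used here as a dependency-free substitute for dplyr::left_join(); the two small data frames and their IDs are invented placeholders, and unlike left_join(), merge() sorts the result by the key column.

```r
# Invented placeholder data: three episodes, details for only two of them
basic <- data.frame(
  imdbID     = c("tt0000001", "tt0000002", "tt0000003"),
  imdbRating = c(9.0, 8.8, 8.7),
  stringsAsFactors = FALSE
)
details <- data.frame(
  imdbID = c("tt0000001", "tt0000003"),
  votes  = c(100L, 80L),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps all rows of `basic`, like a left join
joined <- merge(basic, details, by = "imdbID", all.x = TRUE)
print(joined)

# Episodes without details can be spotted by their NA values
missing_ids <- joined$imdbID[is.na(joined$votes)]
print(missing_ids)
```

Checking for NA after the join is a quick way to see whether any episode came back without details.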
I'm always looking for people to help me improve my work. Contact me directly, open an issue, fork the repository, or submit a pull request.