The goal of tiktokr is to provide a scraper for the video-sharing social networking service TikTok. It is mostly inspired by this Python module: davidteather/TikTok-Api.
You will need Python 3.6 or higher to use tiktokr.
Many thanks go to Vivien Fabry for creating the hexagon logo.
You can install the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("benjaminguinaudeau/tiktokr")
Load the library
library(tiktokr)
Make sure to use your preferred Python installation
library(reticulate)
use_python(py_config()$python)
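Since tiktokr needs Python 3.6 or higher, it can help to check the version of the selected interpreter before installing anything. The following is only a minimal sketch: python_ok is a hypothetical helper, not part of tiktokr or reticulate. It compares version strings with base R's numeric_version, so "3.10" is correctly treated as newer than "3.6".

```r
# Hypothetical helper (not part of tiktokr or reticulate):
# TRUE if a Python version string meets the minimum that tiktokr requires.
python_ok <- function(version, minimum = "3.6") {
  # numeric_version compares component by component, so "3.10" > "3.6"
  numeric_version(version) >= numeric_version(minimum)
}

python_ok("3.10")   # TRUE
python_ok("3.5.2")  # FALSE
```

In a session, this could be called as python_ok(reticulate::py_config()$version), assuming the configuration object exposes the interpreter version in its version field.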
Install necessary Python libraries
tk_install()
In November 2020, TikTok tightened its security measures. It now frequently shows a captcha, which is easily triggered after only a few requests. You can work around this by specifying the cookie parameter. To get a session cookie:
- Open a browser and go to “http://tiktok.com”
- Scroll down a bit to make sure that you don’t get a captcha
- Open the JavaScript console (in Chrome: View > Developer > JavaScript Console)
- Run document.cookie in the console. Copy the entire output (your cookie).
- Run tk_auth() in R and paste the cookie.
Click on the image below for a screen recording of how to get your TikTok cookie:
The tk_auth function saves the cookie (and user agent) as environment variables in your .Renviron file. You only need to run this once before using tiktokr, or again whenever you want to update your cookie or user agent.
tk_auth(cookie = "<paste here the output from document.cookie>")
You need to initialize tiktokr every time before you run its functions:
tk_init()
Returns a tibble with trends.
# Trend
trends <- tk_posts(scope = "trends", n = 200)
Note: A user query often returns only about 2k hits, but the exact limit is unclear. The sample seems to run from the most recent post to the oldest.
user_posts <- tk_posts(scope = "user", query = "willsmith", n = 50)
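Because a captcha is easily triggered after a few requests, querying several users in a row may work better with a pause between calls. The sketch below is one possible approach, not a feature of tiktokr: collect_user_posts, its pause argument, and the injectable fetch argument are all hypothetical. By default fetch calls tk_posts with scope = "user" as in the example above.

```r
# Hypothetical helper (not part of tiktokr): query several users in turn,
# pausing between requests to lower the chance of triggering a captcha.
collect_user_posts <- function(users, n = 50, pause = 2,
                               fetch = function(user, n) {
                                 tiktokr::tk_posts(scope = "user", query = user, n = n)
                               }) {
  results <- vector("list", length(users))
  names(results) <- users
  for (i in seq_along(users)) {
    if (i > 1) Sys.sleep(pause)  # wait before every request except the first
    # list() wrapper keeps the slot even if fetch() returns NULL
    results[i] <- list(fetch(users[[i]], n))
  }
  results  # named list: one element per user
}
```

After tk_init(), a call such as collect_user_posts(c("willsmith"), n = 50) would return a named list with one result per user. The pause length of 2 seconds is an arbitrary assumption; a suitable value is not documented here.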
Note: A hashtag query returns only about 2k hits. These are not drawn randomly or ordered by most recent post date, but are rather some mix of recent and popular TikToks.
hash_post <- tk_posts(scope = "hashtag", query = "maincharacter", n = 100)
With tk_dl_video you can download videos from TikTok.
From Trends:
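library(magrittr)  # provides %>% if it is not already attached
trends <- tk_posts(scope = "trends", n = 5)
# make sure the output folder exists before downloading
dir.create("video", showWarnings = FALSE)
# download each post to video/<id>.mp4
trends %>%
  split(seq_len(nrow(.))) %>%
  purrr::walk(~ tk_dl_video(.x$downloadAddr, paste0("video/", .x$id, ".mp4")))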
From hashtag:
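hash_post <- tk_posts(scope = "hashtag", query = "maincharacter", n = 5)
# make sure the output folder exists before downloading
dir.create("video/hashtag", recursive = TRUE, showWarnings = FALSE)
# download each post to video/hashtag/<id>.mp4
hash_post %>%
  split(seq_len(nrow(.))) %>%
  purrr::walk(~ tk_dl_video(.x$downloadAddr, paste0("video/hashtag/", .x$id, ".mp4")))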
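When downloading many videos, a single failed download would stop the loops above. The following sketch shows one way to keep going and record which downloads succeeded. download_posts and its download argument are hypothetical and not part of tiktokr. The sketch assumes the posts table has the id and downloadAddr columns used above, and by default it calls tk_dl_video with a URL and a target path as in those examples.

```r
# Hypothetical helper (not part of tiktokr): download every post in a table,
# continuing after failures and reporting success per post id.
download_posts <- function(posts, dir = "video",
                           download = function(url, path) {
                             tiktokr::tk_dl_video(url, path)
                           }) {
  # create the output folder (and any parents) if it does not exist yet
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  ok <- logical(nrow(posts))
  for (i in seq_len(nrow(posts))) {
    path <- file.path(dir, paste0(posts$id[[i]], ".mp4"))
    ok[i] <- tryCatch({
      download(posts$downloadAddr[[i]], path)
      TRUE
    }, error = function(e) {
      # report the failure and move on to the next post
      message("Skipping ", posts$id[[i]], ": ", conditionMessage(e))
      FALSE
    })
  }
  stats::setNames(ok, posts$id)  # named logical vector: TRUE = downloaded
}
```

For example, download_posts(hash_post, dir = "video/hashtag") would return a named logical vector, and the names of the FALSE entries identify the posts to retry.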