/scrapers

Useful web scrapers using R and Python

Primary language: Python. License: MIT.

This repo contains useful web scrapers using Beautiful Soup in Python and utils in R.

mpsontwitter_scraper scrapes the total number of interactions (likes + retweets) on all tweets by UK MPs in a given date range. Results can be viewed either per tweet or aggregated per MP. Run time for the year 2016 is approximately 6 hours.
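To illustrate the two result views (per tweet and aggregated per MP), here is a minimal sketch. It is not the scraper's own code: the record layout, the field names (`mp`, `likes`, `retweets`), the function names, and the sample values are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical per-tweet records; field names and values are assumptions,
# not the output format of mpsontwitter_scraper.
tweets = [
    {"mp": "MP A", "likes": 10, "retweets": 5},
    {"mp": "MP B", "likes": 3, "retweets": 1},
    {"mp": "MP A", "likes": 7, "retweets": 2},
]


def interactions_per_tweet(records):
    """Return one entry per tweet with interactions = likes + retweets."""
    return [
        {"mp": r["mp"], "interactions": r["likes"] + r["retweets"]}
        for r in records
    ]


def interactions_per_mp(records):
    """Sum interactions over all tweets belonging to each MP."""
    totals = defaultdict(int)
    for r in records:
        totals[r["mp"]] += r["likes"] + r["retweets"]
    return dict(totals)


per_tweet = interactions_per_tweet(tweets)
per_mp = interactions_per_mp(tweets)
```

Keeping the per-tweet view as the base data and deriving the per-MP totals from it means both views always agree.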

nips_scraper scrapes the titles and abstracts of every NIPS paper published in a given year, and can also download the papers as PDFs and the citations as .txt files.
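The title-extraction step can be sketched roughly as below. This is an assumption-laden illustration, not the scraper's implementation: it uses the standard library's `html.parser` as a dependency-free stand-in for Beautiful Soup, and the sample HTML, the `/paper/` link prefix, and the `PaperLinkParser` class are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical listing page; the real NIPS page structure may differ.
SAMPLE_HTML = """
<ul>
  <li><a href="/paper/1">A Sample Paper Title</a></li>
  <li><a href="/about">About</a></li>
  <li><a href="/paper/2">Another Title</a></li>
</ul>
"""


class PaperLinkParser(HTMLParser):
    """Collect (title, href) pairs for links whose href starts with /paper/."""

    def __init__(self):
        super().__init__()
        self.papers = []
        self._href = None  # href of the paper link currently being read
        self._text = []    # text fragments collected inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.startswith("/paper/"):  # assumed link prefix
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.papers.append((title, self._href))
            self._href = None


parser = PaperLinkParser()
parser.feed(SAMPLE_HTML)
papers = parser.papers
```

Each collected link could then be followed to fetch the abstract, PDF, and citation for that paper.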

imdb_scraper uses R to download IMDb datasets to a given folder as TSV files.