Reddit Scraper for DataCamp Tutorial

Introduction

This is a working example of a web scraper written in Python with BeautifulSoup 4, created to accompany a DataCamp tutorial. The scraper extracts the title, author, likes, and comments of the first 1000 posts in a specified subreddit. The default subreddit is r/datascience. You can find the tutorial here: https://www.datacamp.com/community/tutorials/scraping-reddit-python-scrapy
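To give a rough idea of the kind of extraction involved, here is a minimal sketch of pulling those four fields out of a page with BeautifulSoup. It is not the code from reddit_scraper.py. The sample markup, the `thing` and `title` class names, the `data-*` attribute names, and the `parse_posts` helper are all assumptions made for illustration, and the real page structure may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for a subreddit listing page.
# The class names and data-* attributes are assumptions for this sketch.
SAMPLE_HTML = """
<div id="siteTable">
  <div class="thing" data-author="alice" data-score="42" data-comments-count="7">
    <a class="title" href="/r/datascience/comments/abc/">First post</a>
  </div>
  <div class="thing" data-author="bob" data-score="5" data-comments-count="0">
    <a class="title" href="/r/datascience/comments/def/">Second post</a>
  </div>
</div>
"""


def parse_posts(html):
    """Return one dict per post with title, author, likes and comments."""
    soup = BeautifulSoup(html, "html.parser")
    posts = []
    for thing in soup.find_all("div", class_="thing"):
        title_link = thing.find("a", class_="title")
        posts.append({
            # Fall back to an empty title if the link is missing.
            "title": title_link.get_text(strip=True) if title_link else "",
            "author": thing.get("data-author", ""),
            "likes": int(thing.get("data-score", 0)),
            "comments": int(thing.get("data-comments-count", 0)),
        })
    return posts


posts = parse_posts(SAMPLE_HTML)
for post in posts:
    print(post)
```

Parsing an inline string keeps the sketch runnable without network access. A real scraper would also need to fetch each page and follow the pagination links to reach 1000 posts.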

Setup

All the libraries used in this example can be installed with pip from the included requirements.txt file. Open a terminal or command prompt in the project directory and run the following command.

pip install -r requirements.txt

Using the Scraper

Since this is an accompaniment to a tutorial, there won't be a full description of the code here. To run the script and see its results, use python reddit_scraper.py.

arkdev9/reddit-scraper-datacamp-tutorial
