Primary language: Python. License: MIT.

Python Web Scraper

This is a simple web scraper that downloads text from a website. The content is written to two text files in the output folder:

  • Content.txt
  • Headers.txt
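
To illustrate the split between headers and content, here is a minimal sketch that separates heading text from paragraph text in an HTML string. It uses only the standard library; the class name TextExtractor, the function extract_text, and the choice of h1-h6 and p tags are assumptions made for this example, and the actual main.py may use different packages and rules.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect heading text and paragraph text into two separate lists."""

    HEADER_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headers = []
        self.content = []
        self._current = None  # tag currently being collected, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # Start collecting when a heading or paragraph opens.
        if tag in self.HEADER_TAGS or tag == "p":
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Ignore inline tags such as <b> that close inside the element.
        if tag != self._current:
            return
        # Collapse runs of whitespace and drop empty elements.
        text = " ".join("".join(self._buffer).split())
        if text:
            target = self.headers if tag in self.HEADER_TAGS else self.content
            target.append(text)
        self._current = None


def extract_text(html):
    """Return (headers, content) extracted from an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.headers, parser.content


headers, content = extract_text("<h1>Title</h1><p>Hello <b>world</b>.</p>")
```

In this sketch, headers would go to Headers.txt and content to Content.txt. Fetching the page itself is left out so the example runs without network access.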

Environment File

The environment file contains three values:

BASE_URL=Base URL of the content
HEADER_SLUGS=Slug containing the headers
CONTENT_SLUGS=Slug containing all the content
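
As a rough sketch of how these values could be read and combined, the snippet below parses KEY=VALUE lines and joins the base URL with a slug. The helper names parse_env and build_url, the example.com values, and the assumption that each variable holds a single slug appended to BASE_URL are all illustrative; the project may load and combine them differently.

```python
def parse_env(text):
    """Parse KEY=VALUE lines from environment-file text into a dict."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line
        values[key.strip()] = value.strip()
    return values


def build_url(base_url, slug):
    """Join the base URL and a slug with exactly one slash between them."""
    return base_url.rstrip("/") + "/" + slug.lstrip("/")


# Hypothetical values, for illustration only.
sample = """
BASE_URL=https://example.com
HEADER_SLUGS=headers
CONTENT_SLUGS=content
"""
config = parse_env(sample)
content_url = build_url(config["BASE_URL"], config["CONTENT_SLUGS"])
print(content_url)
```

To use a real file, pass its text to parse_env, for example with pathlib's Path(...).read_text(), using whatever file name the project expects.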

IDE

I use PyCharm as my IDE. It has a really nice UI, similar to Rider. You can also run the solution directly from within PyCharm.

Setup

TL;DR

  • Download and install Python 3
  • Install the required packages - pip3 install -r requirements.txt
  • Make sure you have a folder called output at the same level as main.py
  • Run the scraper - python3 main.py
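
The output folder step is easy to forget. As a hedged sketch, not taken from main.py, the hypothetical helper write_output below creates the folder if it is missing and writes the two files, assuming the scraped headers and content are available as lists of strings.

```python
from pathlib import Path


def write_output(headers, content, output_dir="output"):
    """Write headers and content, one item per line, into output_dir."""
    out = Path(output_dir)
    # Create the folder if it does not exist yet; do nothing if it does.
    out.mkdir(parents=True, exist_ok=True)

    headers_path = out / "Headers.txt"
    content_path = out / "Content.txt"
    headers_path.write_text("\n".join(headers) + "\n", encoding="utf-8")
    content_path.write_text("\n".join(content) + "\n", encoding="utf-8")
    return headers_path, content_path
```

Calling write_output(["Title"], ["First line", "Second line"]) from the directory that contains main.py would place both files in the output folder next to it.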

Installation Guides

This guide contains installation instructions for all operating systems.