/HackerNews-Scraper

A simple web scraper to scrape the Hacker News(HN) website for news at https://news.ycombinator.com

Primary LanguagePythonMIT LicenseMIT

HackerNews-Scraper

A simple web scraper to scrape the Hacker News(HN) website for news at https://news.ycombinator.com

Parameters:

pages: Number of pages one wants the HackerNews for, this creates one file for each page, and a maximum of only 20 pages can be fetched for now.

verbose: Enable or disable verbose output by Y/N, if Y, then progress is printed to the terminal when each page is fetched, else, the program runs silently.

First, please install the dependencies for this scraper by using the requirements.txt file

pip install -r requirements.txt

To use this for your daily share of HackerNews headlines, please clone and use the HackerNews.py file

git clone https://github.com/Bharat123rox/HackerNews-Scraper.git

Future Scope:

  • Add support to extract a small snippet/preview of text from each article
  • Add Multiprocess support in future, making it as an optional argument
  • Add support to fetch more pages

Any contributions to this Project are always Welcome!!