/news-scrapper-backend

Python script that scrapes Stuff, RNZ & Scoop for the latest news 📡🖥🖱

Primary language: Python | License: MIT

A script run as a Cron Job on AWS EC2 that populates an AWS MySQL RDS instance to support the daily-roundup.netlify.app website


Purpose

The purpose of this API is to scrape the latest headlines from Stuff, RadioNZ & Scoop and store them in an AWS RDS instance so that they can be served to react-news-website


How it works

  • A Python script built with beautifulsoup4 scrapes data from Stuff, RadioNZ & Scoop
  • An AWS RDS instance stores all of this data. The mysql.connector library is used to write to it
  • A Cron Job is set up on an AWS EC2 instance to scrape these websites every 15 minutes and populate the DB
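
The steps above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names, the CSS selector argument and the `headlines` table with its `source`, `title` and `url` columns are all assumptions made for the example.

```python
from urllib.request import urlopen

from bs4 import BeautifulSoup


def fetch_html(url):
    """Download a page and return its body as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_headlines(html, source, link_selector="a"):
    """Return (source, title, url) tuples for every link matching the selector.

    The selector is site-specific and hypothetical here; each news site
    needs its own, found by inspecting that site's markup.
    """
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for link in soup.select(link_selector):
        title = link.get_text(strip=True)
        url = link.get("href")
        if title and url:  # skip anchors without text or without a target
            rows.append((source, title, url))
    return rows


def store_headlines(connection, rows):
    """Insert rows through a DB-API connection such as mysql.connector's.

    Assumes a table `headlines(source, title, url)` already exists.
    """
    cursor = connection.cursor()
    cursor.executemany(
        "INSERT INTO headlines (source, title, url) VALUES (%s, %s, %s)",
        rows,
    )
    connection.commit()
    cursor.close()
```

In use, the connection would come from `mysql.connector.connect(host=..., user=..., password=..., database=...)` pointed at the RDS endpoint, and the script would call `fetch_html`, `parse_headlines` and `store_headlines` once per site. A crontab entry along the lines of `*/15 * * * * python3 /path/to/scraper.py` (the path is a placeholder) would run it every 15 minutes.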

Architecture Diagram

diagram


Technologies & Libraries used 🚀

  • React
  • AWS RDS
  • AWS EC2
  • Flask API
  • Zappa Library
  • mysql.connector library