Web Scraping Live Dashboard with Python and MongoDB

Use Python, MongoDB, and HTML to build a web application that scrapes the latest internet articles, images, and related information.

Goals  •  Dataset  •  Tools Used

Goals

We're building a live web app that scrapes a set of websites. The script is designed to pull the most recent data: each time it runs, it collects the newest content available, so as long as the source sites keep publishing new articles, the app stays current. From the news site, the data we want is the most recent article along with its summary. We plan to achieve this by:

  • using Python to pull data from multiple websites,
  • storing the scraped data in MongoDB, a NoSQL database,
  • presenting the collected data in a central location: a Flask webpage built with HTML.
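
The first two steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the `div.article`, `.title`, and `.summary` markup, the `SAMPLE_HTML` string, and the function names `parse_latest_article` and `save_latest` are all assumptions made for the example. In the real app the HTML would come from a live page (for instance Splinter's `browser.html` after `browser.visit(url)`) rather than a hard-coded string.

```python
from bs4 import BeautifulSoup


def parse_latest_article(html):
    """Return the first article's title and summary, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    # Assumed markup: <div class="article"> wrapping .title and .summary.
    # Real sites will use different tags and class names.
    article = soup.find("div", class_="article")
    if article is None:
        return None
    title = article.find(class_="title")
    summary = article.find(class_="summary")
    return {
        "title": title.get_text(strip=True) if title else None,
        "summary": summary.get_text(strip=True) if summary else None,
    }


def save_latest(collection, data):
    """Overwrite the stored snapshot with the newest scrape."""
    # `collection` is anything exposing pymongo's Collection.update_one.
    # An empty filter plus upsert=True keeps a single document up to date.
    collection.update_one({}, {"$set": data}, upsert=True)


# Stand-in for a scraped page (hypothetical content, newest article first).
SAMPLE_HTML = """
<div class="article">
  <div class="title">Newest headline</div>
  <div class="summary">One-line summary of the story.</div>
</div>
<div class="article">
  <div class="title">Older headline</div>
  <div class="summary">Older summary.</div>
</div>
"""

latest = parse_latest_article(SAMPLE_HTML)
print(latest)
```

To write to a real database, pass a pymongo collection such as `MongoClient("mongodb://localhost:27017")["news_app"]["latest"]`; the database and collection names here are placeholders. Overwriting one document on each run fits the goal of always showing the newest data, though keeping a history would call for `insert_one` instead.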

Dataset

We'll use our web application to scrape news articles, high-quality images, and a collection of facts from three different websites, store them in MongoDB, and display them on our webpage via Flask.
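
As a rough sketch of the display step, a Flask route can read the stored document and render it as HTML. The `create_app` factory, the inline `PAGE` template, and the `title`, `summary`, and `image_url` field names are assumptions made for this example; the project's own app and templates may be organized differently.

```python
from flask import Flask, render_template_string

# Inline template to keep the sketch in one file; a real app would more
# likely keep this in templates/index.html and call render_template.
PAGE = """
<!doctype html>
<title>Latest Scrape</title>
{% if item %}
  <h1>{{ item["title"] }}</h1>
  <p>{{ item["summary"] }}</p>
  {% if item["image_url"] %}
    <img src="{{ item["image_url"] }}" alt="Featured image">
  {% endif %}
{% else %}
  <p>No data yet. Run the scraper first.</p>
{% endif %}
"""


def create_app(collection):
    """Build a Flask app that renders the newest document in `collection`."""
    app = Flask(__name__)

    @app.route("/")
    def index():
        # pymongo's find_one returns a dict, or None when the collection is empty.
        item = collection.find_one()
        return render_template_string(PAGE, item=item)

    return app
```

With a local MongoDB running, something like `create_app(MongoClient("mongodb://localhost:27017")["news_app"]["latest"]).run(debug=True)` would serve the page; the connection string and names are placeholders. Passing the collection in as an argument also lets the route be exercised with a stand-in object, without a database.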

Tools Used

  • Python: Programming language used to build the scraper and web application
    • Splinter: Python tool that will automate our web browser as we begin scraping
    • Beautiful Soup: Python package used for parsing HTML and XML documents
    • Flask: Python tool used for developing web applications
  • MongoDB: NoSQL database used to store unstructured data, such as images
  • HTML: Hypertext Markup Language used to build and design webpages
  • Jupyter Notebook: Open-source, web-based application used to run our Python code
