Scraper developed for a challenge to scrape Lenovo laptops from a website.

Primary language: Python

busca_milhas_Lenovo_challenge

Description

  • Go to webscraper.io and get all Lenovo notebooks sorted from the cheapest to the most expensive.

  • For optimization purposes, the collected data is stored in a JSON file.

Getting Started

Framework used

  • Flask

Packages used

  • Selenium
  • json
  • jsonify (from Flask)
  • render_template (from Flask)

Driver used

The executable driver path needs to be set on line 6 of getting_all_files.py:

getting_all_files.py  👉🏿  (executable_path=driver_path)

Note 📝

The project is divided into two phases:

  • Collecting the laptops' information
  • Rendering the page where all the information is displayed

Running the Project

Clone this repository:

git clone https://github.com/wjj28/busca_milhas_challenge.git

Change into the project folder:

cd busca_milhas_challenge

Collect the data and store it as JSON:

python import2json.py

Render the webpage to get all the data:

python restful_api.py

Objectives Breakdown

  • Collect all the Lenovo laptops' links from the website
  • Collect all the information available for each laptop
  • Sort the laptops by price (from cheapest to the most expensive)
  • Save the results into a JSON file
  • Generate a RESTful API to display the laptops' information in JSON format
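
The sorting and saving objectives above can be sketched with the standard library alone. This is a minimal illustration, not the repository's actual code: the function names, the sample records, and the "title" and "price" field names are all assumptions made for the example.

```python
import json
from pathlib import Path


def sort_by_price(laptops):
    """Return a new list ordered from cheapest to most expensive."""
    return sorted(laptops, key=lambda item: item["price"])


def save_as_json(laptops, path):
    """Write the laptop records to a JSON file at the given path."""
    Path(path).write_text(json.dumps(laptops, indent=2), encoding="utf-8")


# Hypothetical records; the real scraper's field names may differ.
sample = [
    {"title": "Lenovo B", "price": 499.0},
    {"title": "Lenovo A", "price": 299.0},
    {"title": "Lenovo C", "price": 399.0},
]
```

Calling save_as_json(sort_by_price(sample), "laptops.json") would write the three records in ascending price order. Because sorted() returns a new list, the scraped data keeps its original order in memory.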

What Was Successfully Accomplished

  • Collecting all the Lenovo laptops' info ✔
  • Collecting the prices for the different HDD of each laptop ✔
  • Sorting the laptops by price ✔
  • Saving the results into a JSON file ✔
  • Generating a RESTful API to display the laptops' information in JSON format ✔
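
For the last item, a Flask endpoint that returns the saved records as JSON might look roughly like the sketch below. It is an assumption about the general shape of such an endpoint, not a copy of restful_api.py: the "/laptops" route, the "laptops.json" file name, and the load_laptops helper are hypothetical.

```python
import json
from pathlib import Path

from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical file name; use whatever path the collection step wrote to.
DATA_FILE = Path("laptops.json")


def load_laptops(path=DATA_FILE):
    """Read the saved records, or return an empty list if none exist yet."""
    path = Path(path)
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


@app.route("/laptops")
def list_laptops():
    # jsonify serializes the list and sets the JSON content type.
    return jsonify(load_laptops())
```

Calling app.run() starts Flask's development server, after which a GET request to /laptops returns the stored list. Reading the file on each request keeps the sketch simple; a real service might load the data once at startup instead.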

Built With