/Flipkart_ProductReview_Scraping

This web app does the web scraping of reviews from Flipkart Website.

Primary LanguagePython

Flipkart_ProductReview_Scraping

Overview

This web app scraps the product reviews using BeautifulSoup library and displays the result back to you.

Prerequisites :

The things needed before we start building a python based web scraper are:

  • Python installed.
  • A Python IDE (Integrated Development Environment): like PyCharm, Spyder, or any other IDE of choice
  • Flask Installed.
  • MongoDB installed .
  • Basic understanding of Python and HTML.
  • Basic understanding of Git , download Git CLI
  • Download the Heroku CLI

How To Run :

Open Up Any Python IDE :

activate your environment 

Install the packages in your environment :

pip install -r requirements.txt

Open MongoDB Compass:

Connect from your local: mongodb://localhost:27017/ to see the reviews you have scrapped (after scraping)

Run app.py :

python app.py (on your terminal or right click and run)
  • Enter The Product you want to scrap and the reviews will be displayed and stored in MongoDB so that next time it will scrap from the database itself.

Heroku Deployment

  • After installing the Heroku CLI, Open a command prompt window and navigate to your ‘Flipkart_ProductReview_Scraping’ folder.

  • Type the command to login to your heroku account as shown below: :

heroku login
  • After logging in to Heroku, enter the command as below to create a heroku app. It will give you the URL of your Heroku app after successful creation. :
heroku create 
  • Before deploying the code to the Heroku cloud, we need to commit the changes to the local git repository.

  • Type the command as below to initialize a local git repository as shown below: :

git init 
git status
git add .
git commit -m "initial commit" 
  • Enter the command as below to push the code to the heroku cloud.
git push heroku master
  • After deployment, heroku gives you the URL to hit the web API.

  • Once your application is deployed successfully, enter the command as below to see the logs.

heroku logs --tail