This web app scrapes product reviews using the BeautifulSoup library and displays the results back to you.
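As a rough illustration of the scraping step, the sketch below parses review blocks out of an HTML string with BeautifulSoup. The markup, the class names (review, rating, title, body) and the function name parse_reviews are all assumptions made for this example; the real product pages and the app's own code will differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the tag and class names below are placeholders,
# not the real structure of any product page.
SAMPLE_HTML = """
<div class="review">
  <p class="rating">5</p>
  <p class="title">Great phone</p>
  <p class="body">Battery lasts two days.</p>
</div>
<div class="review">
  <p class="rating">2</p>
  <p class="title">Not worth it</p>
  <p class="body">Screen cracked in a week.</p>
</div>
"""


def parse_reviews(html):
    """Return one dict per review block found in the HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    reviews = []
    for block in soup.find_all("div", class_="review"):
        reviews.append({
            "rating": block.find("p", class_="rating").get_text(strip=True),
            "title": block.find("p", class_="title").get_text(strip=True),
            "comment": block.find("p", class_="body").get_text(strip=True),
        })
    return reviews


if __name__ == "__main__":
    for review in parse_reviews(SAMPLE_HTML):
        print(review)
```

On a live page the HTML would come from an HTTP request rather than a string, and the selectors would need to match whatever the page actually uses.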
Prerequisites:
Before building a Python-based web scraper, you need:
- Python installed.
- A Python IDE (Integrated Development Environment) such as PyCharm, Spyder, or any other IDE of your choice.
- Flask installed.
- MongoDB installed.
- Basic understanding of Python and HTML.
- Basic understanding of Git, with the Git CLI installed.
- The Heroku CLI installed.
How to Run:
Open any Python IDE:
Activate your environment.
Install the packages in your environment:
pip install -r requirements.txt
Open MongoDB Compass:
Connect to your local instance at mongodb://localhost:27017/ to see the reviews you have scraped (they appear only after a scrape has run).
Run:
python (on your terminal or right click and run)
- Enter the product you want to scrape. The reviews are displayed and stored in MongoDB, so the next time you search for the same product they are read from the database instead of being scraped again.
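The "check the database first, scrape only if nothing is stored" behaviour can be sketched as follows. This is a minimal illustration, not the app's actual code: the function name get_reviews, the "product" field, and the in-memory stand-in class are assumptions. The stand-in only mimics the find and insert_many methods that a pymongo collection also provides, so the example runs without a database.

```python
class InMemoryCollection:
    """Tiny stand-in for a MongoDB collection (illustration only)."""

    def __init__(self):
        self.docs = []

    def find(self, query):
        # Return every stored document whose fields match the query exactly.
        return [d for d in self.docs
                if all(d.get(k) == v for k, v in query.items())]

    def insert_many(self, docs):
        self.docs.extend(docs)


def get_reviews(product, collection, scrape):
    """Return stored reviews for `product`, scraping only on a cache miss."""
    key = product.strip().lower()  # normalise so "Phone X" and "phone x" match
    cached = list(collection.find({"product": key}))
    if cached:
        return cached
    # Cache miss: scrape, tag each review with the product key, then store.
    reviews = [dict(r, product=key) for r in scrape(product)]
    if reviews:  # skip the insert when nothing was scraped
        collection.insert_many(reviews)
    return reviews


if __name__ == "__main__":
    store = InMemoryCollection()
    fake_scrape = lambda product: [{"comment": "Works well"}]  # placeholder scraper
    print(get_reviews("Phone X", store, fake_scrape))  # scraped and stored
    print(get_reviews("phone x", store, fake_scrape))  # served from the store
```

With a real database, the collection argument would be a pymongo collection obtained from a client connected to mongodb://localhost:27017/, and scrape would be the function that fetches and parses the product page.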
Heroku Deployment
After installing the Heroku CLI, open a command prompt window and navigate to your ‘Flipkart_ProductReview_Scraping’ folder.
Log in to your Heroku account with the command below:
heroku login
- After logging in, enter the command below to create a Heroku app. On success, it prints the URL of your new app:
heroku create
Before deploying the code to the Heroku cloud, commit your changes to a local Git repository.
Enter the commands below to initialize the repository and make the first commit:
git init
git status
git add .
git commit -m "initial commit"
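- If heroku create was run before git init, the heroku remote will not exist yet; add it with heroku git:remote -a <your-app-name>. Then enter the command below to push the code to the Heroku cloud: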
git push heroku master
After deployment, Heroku prints the URL at which the web app can be reached.
Once the application has deployed successfully, enter the command below to view the logs:
heroku logs --tail