Product-Info-Crawler is a Python web crawler developed with the Scrapy framework. It has four spiders that crawl search results from olx.in, amazon.in, ebay.in and shopclues.com. The crawler extracts product names, prices, image URLs, product URLs and the source site, and stores them in a CSV file named results.csv. It is useful for comparing the price of a particular product across different e-commerce websites.
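Once a crawl has finished, results.csv can be post-processed to compare prices across the sites. Below is a minimal sketch of such a comparison; the column names ("name", "price", "source") are assumptions based on the fields listed above, not necessarily the exact CSV headers.

```python
import csv
from collections import defaultdict

# Group crawled rows by product name so prices from different sites can be
# compared side by side. The column names used here are assumptions based on
# the fields described above, not the project's exact headers.
prices_by_product = defaultdict(list)

with open("results.csv") as f:
    for row in csv.DictReader(f):
        prices_by_product[row["name"]].append((row["source"], row["price"]))

for product, offers in prices_by_product.items():
    print(product)
    for source, price in offers:
        print("  %s: %s" % (source, price))
```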
For Windows:
- Install Python 2.7 (download it from https://www.python.org/downloads/)
- Install pip (follow the instructions at https://pip.pypa.io/en/stable/installing/)
- Install Scrapy using
pip install scrapy
- Install Flask using
pip install flask
For Linux (Debian/Ubuntu):
- Install pip using
sudo apt-get install python-pip
- Install Scrapy using
sudo pip install scrapy
- Install Flask using
sudo pip install flask
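Whichever platform you install on, a quick import check confirms that both dependencies are available before running the crawler (a minimal, project-independent sketch):

```python
# Sanity check: both packages should import and report a version.
import scrapy
import flask

print("scrapy %s, flask %s" % (scrapy.__version__, flask.__version__))
```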
- Open a command line
- Go to the project's root directory, i.e. Product-Info-Crawler
- Run the following command (a sketch of how such a script can drive the spiders follows this list)
python run_crawler.py
- Enter the search keyword (a product or brand name) when prompted on the command line.
- See the crawling results in the results.csv file
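For reference, a script like run_crawler.py usually drives several Scrapy spiders from a single keyword through Scrapy's CrawlerProcess API. The sketch below only illustrates that pattern; the spider names and the `query` argument are assumptions, not the project's actual code.

```python
# Illustrative sketch of driving multiple spiders with one keyword.
# The spider names and the "query" keyword argument are assumptions,
# not the project's actual identifiers.
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

keyword = raw_input("Enter a product or brand name: ")  # Python 2.7 input

process = CrawlerProcess(get_project_settings())
for name in ("olx", "amazon", "ebay", "shopclues"):
    process.crawl(name, query=keyword)
process.start()  # blocks until every spider has finished writing its items
```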
- Open a terminal
- Go to the demo directory, i.e. Product-Info-Crawler/demo
- Run
python run.py
- Open http://127.0.0.1:5000/ in a browser window
- Enter the search keyword and click the search button
- The products found are displayed with their images, prices and source info
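The demo is a small Flask front end over the crawler's output. The sketch below shows the general idea under the assumption that the app reads results.csv and renders it; the route, inline template and column names are illustrative, not the project's actual demo code.

```python
import csv
from flask import Flask, render_template_string

app = Flask(__name__)

# Inline template keeps the sketch self-contained; the real demo ships its own templates.
PAGE = """
<ul>
{% for p in products %}
  <li><img src="{{ p.image }}" height="60"> {{ p.name }} - {{ p.price }} ({{ p.source }})</li>
{% endfor %}
</ul>
"""

@app.route("/")
def index():
    # Column names are assumptions based on the fields listed in the description.
    with open("results.csv") as f:
        products = list(csv.DictReader(f))
    return render_template_string(PAGE, products=products)

if __name__ == "__main__":
    app.run(port=5000)
```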
Feel free to post issues if you find any problems, or contact me, Aishwarya Mittal.
© MIT License