Amazon-Books-Crawler

Amazon-Books-Crawler is a Python web spider built with the Scrapy framework. For now it has a single spider, which scrapes Python books from https://www.amazon.com search results. For each book the spider extracts title, description, paperback_price, author, star_rate, reviews, img_url and img_path, and stores the results in a SQLite3 database. The data can also be exported to a JSON or CSV file with a single command: scrapy crawl amazon -o file.json (use a .csv file name to get CSV instead). To review the logs after a crawl, read log.txt in the main directory.
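
To illustrate how scraped fields can end up in SQLite, here is a minimal sketch of a Scrapy item pipeline. It is not the project's actual code: the class name SQLiteBooksPipeline, the table name books, the default file name books.db and the choice of TEXT columns are all assumptions made for the example. Only the field names come from the list above, and each value is assumed to be a plain string or missing.

```python
import sqlite3

# Field names as listed in this README; everything else below is hypothetical.
FIELDS = (
    "title", "description", "paperback_price", "author",
    "star_rate", "reviews", "img_url", "img_path",
)


class SQLiteBooksPipeline:
    """Hypothetical item pipeline that writes each scraped book to SQLite."""

    def __init__(self, db_path="books.db"):  # file name is an assumption
        self.db_path = db_path
        self.conn = None

    def open_spider(self, spider):
        # Called by Scrapy when the spider starts: open the DB, create the table.
        self.conn = sqlite3.connect(self.db_path)
        columns = ", ".join(f"{name} TEXT" for name in FIELDS)
        self.conn.execute(f"CREATE TABLE IF NOT EXISTS books ({columns})")

    def process_item(self, item, spider):
        # Missing fields are stored as NULL; values are passed as parameters.
        values = [item.get(name) for name in FIELDS]
        placeholders = ", ".join("?" for _ in FIELDS)
        self.conn.execute(
            f"INSERT INTO books ({', '.join(FIELDS)}) VALUES ({placeholders})",
            values,
        )
        self.conn.commit()
        return item

    def close_spider(self, spider):
        # Called by Scrapy when the spider finishes.
        self.conn.close()
```

A pipeline like this would be switched on through the ITEM_PIPELINES setting in settings.py, using whatever module path the class actually lives under.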

Dependencies

Python 3
Scrapy
SQLite3

Installing Dependencies

virtualenv -p python3 scrapy_books_spyder
cd scrapy_books_spyder
source bin/activate
git clone https://github.com/w-e-ll/scrapy-web-spyder.git
cd scrapy-web-spyder
pip install -r requirements.txt
cd amazon
scrapy crawl amazon

made by: https://w-e-ll.com
