Marketplace Web-Crawler
- Use the Products API to get a JSON list of the products
- Use the products JSON list to crawl product specifications from the HTML pages
- Use the products JSON list to request the Reviews API for every product
- Clean the collected JSON files
- Extract valuable information from the product specifications
- Dump the data into the database
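The steps above can be sketched roughly as follows. This is only an illustration, not the project's actual code: `clean_record`, `run_pipeline`, and the injected callables (`fetch_products`, `fetch_specs`, `fetch_reviews`, `save`) are hypothetical names, and the real endpoints are intentionally left out. The cleaning rule (strip strings, drop empty fields) is also an assumption.

```python
from typing import Any, Callable, Dict, Iterable

Record = Dict[str, Any]


def clean_record(record: Record) -> Record:
    """Strip whitespace from string values and drop empty fields (assumed rule)."""
    cleaned: Record = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = value.strip()
        # Skip values that carry no information.
        if value is None or value == "" or value == [] or value == {}:
            continue
        cleaned[key] = value
    return cleaned


def run_pipeline(
    fetch_products: Callable[[], Iterable[Record]],
    fetch_specs: Callable[[Record], Record],
    fetch_reviews: Callable[[Record], Iterable[Record]],
    save: Callable[[Record], None],
) -> int:
    """Run the crawl for every product and return how many records were saved.

    All four callables are hypothetical stand-ins: the products request,
    the HTML specifications crawl, the reviews request, and the database write.
    """
    count = 0
    for product in fetch_products():
        product = clean_record(product)
        record = {
            "product": product,
            "specs": clean_record(fetch_specs(product)),
            "reviews": [clean_record(r) for r in fetch_reviews(product)],
        }
        save(record)
        count += 1
    return count
```

Passing the fetchers and the `save` function in as arguments keeps the flow testable without network access or a running database, for example by using `lambda: [...]` as `fetch_products` and `some_list.append` as `save`.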
I had to inspect my outgoing traffic to find the marketplace's API endpoints.
Therefore, I do not think it is ethical to publish them online.
- Python 3.8+
- Docker - optional
1. Get the project
git clone https://github.com/zhanymkanov/reviews_parser
2a. Install the packages without docker
pip install -r requirements/base.txt
2b. Install the packages with docker
docker-compose up -d --build