hunghinh2000/tinhtevn-crawler
This project crawls all threads in a specific category on Tinhte.vn, using the Scrapy library.
Python
Written by: hunghinh2000
Input: the link of the category you want to crawl on Tinhte.vn. You can change the link in ./config/main.cfg.
Output: an image folder containing all images from each thread, and a json folder containing information about those images.
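As a rough illustration of how a category link could be read from a .cfg file, here is a minimal sketch using Python's standard configparser module. The section name `crawler`, the key `category_url`, and the sample URL are all assumptions made for this example; check ./config/main.cfg for the names the project actually uses.

```python
import configparser

# Hypothetical config contents. The section name, key name, and URL below are
# placeholders for illustration, not the project's real main.cfg.
SAMPLE_CFG = """
[crawler]
category_url = https://tinhte.vn/forums/example-category/
"""


def load_category_url(cfg_text, section="crawler", key="category_url"):
    """Parse INI-style text and return the configured category link."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return parser.get(section, key)


category_url = load_category_url(SAMPLE_CFG)
print(category_url)
```

To read from the file on disk instead of a string, `parser.read("./config/main.cfg")` could replace the `read_string` call.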
Requires Python >= 3.5.
Run this command to install requirements:
pip3 install -r requirements.txt
Change the system config in ./config/main.cfg to suit your environment, then run:
python3 main.py
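The output described above (an image folder per thread plus a json folder of image information) might be laid out along these lines. This is only a sketch under assumed names: the function `save_image_info`, the `image/<thread_id>/` and `json/<thread_id>.json` layout, and the record fields are hypothetical and may differ from what the crawler actually writes.

```python
import json
import tempfile
from pathlib import Path


def save_image_info(output_root, thread_id, image_records):
    """Create the per-thread image folder and write image metadata as JSON.

    The folder layout and record fields are assumptions for illustration.
    """
    root = Path(output_root)
    image_dir = root / "image" / str(thread_id)  # images of one thread
    json_dir = root / "json"                     # metadata files
    image_dir.mkdir(parents=True, exist_ok=True)
    json_dir.mkdir(parents=True, exist_ok=True)

    json_path = json_dir / (str(thread_id) + ".json")
    with json_path.open("w", encoding="utf-8") as fh:
        # ensure_ascii=False keeps Vietnamese text readable in the file
        json.dump(image_records, fh, ensure_ascii=False, indent=2)
    return json_path


# Usage with a temporary directory and made-up records
with tempfile.TemporaryDirectory() as tmp:
    records = [{"file": "0001.jpg", "caption": "Ảnh ví dụ"}]
    written = save_image_info(tmp, 12345, records)
    with written.open(encoding="utf-8") as fh:
        loaded = json.load(fh)
    image_dir_exists = (Path(tmp) / "image" / "12345").is_dir()
```

Keeping the metadata in a separate json folder means the image files can be moved or post-processed without touching the records.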