A Crawler for fetching discount data
- Python 2.7
- Scrapy 0.24
- scrapyd
- Make sure you have successfully installed `python2.7`, `scrapy`, and `scrapyd`.
- In your terminal, use the command `scrapy crawl <spider_name>` to crawl data.
- Available values for `spider_name`: `jd` (for data on jd.com) and `smzdm`
(for data on smzdm.com).
- To ensure the data is up-to-date, you may use the `crontab` command on a Unix-like system to run the crawl periodically. For example, create a file named `period_task`, write `0 */2 * * * scrapy crawl jd` in it, and then type `sudo crontab period_task` in your terminal to activate it. Refer to the manual page of `crontab` for more details.
- The crawled data will be created under the project folder as a JSON file.
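The crontab step above can be sketched as a file like the following. Note that cron jobs run with a minimal environment, so absolute paths are safer than bare command names; the paths shown here are placeholders to adjust for your system (find the scrapy binary with `which scrapy`, and use the directory containing `scrapy.cfg` as the project folder):

```
# period_task — run the jd spider every 2 hours, on the hour.
# /path/to/project and /usr/local/bin/scrapy are example paths.
0 */2 * * * cd /path/to/project && /usr/local/bin/scrapy crawl jd
```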
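Once a crawl finishes, the JSON output can be inspected with a short script. This is only a sketch: the filename `sample.json` and the record fields (`title`, `price`) are made-up stand-ins, since the actual output filename and item fields depend on the spider and feed settings.

```python
import json

def load_discounts(path):
    """Load crawled discount records from a JSON output file."""
    with open(path) as f:
        return json.load(f)

# Write a sample record shaped like a discount item, purely for
# illustration; the real fields depend on the spider's Item definition.
sample = [{"title": "example deal", "price": 9.9}]
with open('sample.json', 'w') as f:
    json.dump(sample, f)

items = load_discounts('sample.json')
print('%d items loaded' % len(items))  # -> 1 items loaded
```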