
A distributed crawler system based on Python, Scrapy, and MongoDB

Primary language: Python

CrawlerMain

This is a distributed vertical crawler system for a set of websites, including the stock message board of Eastmoney.com, the news portal Sina.com.cn, and the social trading platform Xueqiu.com.

Design

  • Windows platform
  • Python3 + Scrapy
  • Redis for queueing and distributed crawling
  • MongoDB as the data storage system
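
As a rough illustration of how the Redis and MongoDB pieces of this design might fit together, here is a minimal sketch of a shared URL frontier with de-duplication and an upsert-style store. It is not taken from the project: `UrlFrontier`, `save_item`, `InMemoryRedis`, and the `crawler:frontier` key are hypothetical names. The frontier relies only on the redis-py calls `sadd`, `lpush`, and `rpop`, and the store relies only on PyMongo's `Collection.update_one`, so the in-memory stand-in lets the sketch run without a server.

```python
class InMemoryRedis:
    """Stand-in for a Redis client (assumption: only used for local runs).

    It mimics the three redis-py calls the frontier needs.
    """

    def __init__(self):
        self._sets = {}
        self._lists = {}

    def sadd(self, name, value):
        # Like Redis SADD: returns 1 if the member is new, 0 otherwise.
        members = self._sets.setdefault(name, set())
        if value in members:
            return 0
        members.add(value)
        return 1

    def lpush(self, name, value):
        # Like Redis LPUSH: insert at the head, return the new length.
        items = self._lists.setdefault(name, [])
        items.insert(0, value)
        return len(items)

    def rpop(self, name):
        # Like Redis RPOP: remove from the tail, None when empty.
        items = self._lists.get(name)
        return items.pop() if items else None


class UrlFrontier:
    """Hypothetical queue shared by all crawler workers."""

    def __init__(self, client, key="crawler:frontier"):
        self.client = client
        self.queue_key = key
        self.seen_key = key + ":seen"

    def push(self, url):
        """Enqueue url unless a worker has already recorded it as seen."""
        if self.client.sadd(self.seen_key, url) == 0:
            return False
        self.client.lpush(self.queue_key, url)
        return True

    def pop(self):
        """Return the oldest queued URL, or None if the queue is empty."""
        value = self.client.rpop(self.queue_key)
        # A real Redis client returns bytes unless decode_responses=True.
        if isinstance(value, bytes):
            value = value.decode("utf-8")
        return value


def save_item(collection, item):
    """Upsert a scraped item keyed by its URL (the key choice is an assumption)."""
    collection.update_one({"url": item["url"]}, {"$set": item}, upsert=True)


if __name__ == "__main__":
    frontier = UrlFrontier(InMemoryRedis())
    frontier.push("https://xueqiu.com/")
    frontier.push("https://xueqiu.com/")  # duplicate, ignored
    print(frontier.pop())
```

Against real services, the same classes would presumably receive a `redis.Redis(...)` client and a PyMongo collection instead of the stand-in. Keeping the seen-set in Redis rather than in each process is what lets several workers share one frontier without fetching the same page twice.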