kvdaniel/IWCT_Weibo_Crawler

This repository designing a sina weibo crawler is dedicated to the research program of IWCT,SJTU

Python

IWCT_Weibo_Crawler

This repository designing a sina weibo crawler is dedicated to the research program of IWCT,SJTU

=======

#Requirements:

Scrapy >= 0.14
redis-py (tested on 2.4.9)
redis server (tested on 2.4-2.6)
BeautifulSoup
pymongo

Installation

$ sudo apt-get install redis-server
$ sudo pip install requirements.txt

IWCT_Weibo_Crawler的功能

微博模拟登录
分布式/多线程抓取框架
抓取任务接口(用户资料/朋友网/微博内容等)
页面内容解析
数据存储（Redis/MongoDB）

IWCT_Weibo_Crawler Provides

WEIBO Login Simulator
Distributed/Multi-Threading Extraction Framework
Extraction Task Interface(user profile/social network/weibos etc.)
Weibo Page Parser
Data Storage(Redis/MongoDB)

How to Use IWCT_Weibo_Crawler

run command **$ scrapy crawl weibospider ** on your console
under current directory