blog-crawler: A Python repository from younghz

###Function

Crawl my CSDN blog from start to finish.

###Anti-ban features

Rotate the User-Agent --> useragentmiddleware.py
Rotate the proxy IP --> proxy.py
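
The two rotation features above can be sketched as Scrapy downloader middlewares. This is only a minimal illustration and not the actual contents of useragentmiddleware.py or proxy.py: the class names, the `USER_AGENTS` and `PROXIES` pools, and the placeholder addresses are all assumptions. The classes rely only on the `process_request(request, spider)` hook and on Scrapy reading the proxy from `request.meta["proxy"]`.

```python
import random

# Hypothetical pools for illustration; replace with real values.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]
PROXIES = [
    "http://127.0.0.1:8080",  # placeholder address
    "http://127.0.0.1:8081",  # placeholder address
]


class RotateUserAgentMiddleware:
    """Pick a random User-Agent header for every outgoing request."""

    def __init__(self, user_agents=USER_AGENTS):
        self.user_agents = list(user_agents)

    def process_request(self, request, spider):
        request.headers["User-Agent"] = random.choice(self.user_agents)
        return None  # None tells Scrapy to keep processing the request


class RotateProxyMiddleware:
    """Pick a random proxy for every outgoing request."""

    def __init__(self, proxies=PROXIES):
        self.proxies = list(proxies)

    def process_request(self, request, spider):
        # Scrapy's built-in proxy support reads the proxy URL from this key.
        request.meta["proxy"] = random.choice(self.proxies)
        return None
```

Middlewares like these would be enabled through the `DOWNLOADER_MIDDLEWARES` setting in the project's settings.py, using whatever module path the project actually has.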

###Reusable components

useragentmiddleware.py
proxy.py

###TODO

When a proxy IP is unusable, Scrapy automatically retries with another IP, but this wastes a lot of time.
Choosing reliable proxy IPs is therefore important. A better approach is to catch the error and handle it directly; this is still to be done.
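
One possible way to approach this TODO is a downloader middleware that drops a failing proxy as soon as an error is raised, instead of waiting for repeated retries. The sketch below is an assumption about how it could look, not code from this repository: `ProxyFailoverMiddleware` is a hypothetical name, and for simplicity every download exception is treated as a proxy failure, which a real version would want to narrow down.

```python
import random


class ProxyFailoverMiddleware:
    """Hypothetical sketch: remove a proxy from the pool once it fails."""

    def __init__(self, proxies):
        self.proxies = list(proxies)

    def process_request(self, request, spider):
        if self.proxies:
            request.meta["proxy"] = random.choice(self.proxies)
        return None  # continue normal processing

    def process_exception(self, request, exception, spider):
        # Simplification: any exception is blamed on the current proxy.
        bad_proxy = request.meta.get("proxy")
        if bad_proxy in self.proxies:
            self.proxies.remove(bad_proxy)
        if not self.proxies:
            return None  # nothing left to try; fall back to default handling
        # Reschedule the same request through a different proxy.
        request.meta["proxy"] = random.choice(self.proxies)
        request.dont_filter = True  # keep the duplicate filter from dropping it
        return request
```

Returning the request from `process_exception` asks Scrapy to schedule it again, so a dead proxy costs one failed attempt rather than a full retry cycle.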
