Master-Application-Helper

A crawler that collects all admission information posted on 1point3acres.com (一亩三分地)


This project is unfinished. Follow-up work is tracked at https://github.com/JohnDing1995/Selecting-Master-Program-USA

You can now search admission cases at http://123.206.99.164/

Requirements

Python 3.6

Scrapy

Pymongo

Usage

  1. Clone this repository

  2. cd to the root of the project and run scrapy crawl ad_information

  3. The data will be stored in your MongoDB database, like this: [Screenshot 2017-03-13 10.49.52 PM]

  4. You can query the database according to your own needs

    e.g. to find admissions to the CS master's program at North Carolina State University (NCSU), you can query like this:

    [Screenshot 2017-03-14 9.08.03 AM]

  5. I'm developing a website that provides an admission information query service (see Current progress)
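
The query in step 4 is only shown as a screenshot, so here is a rough sketch of what such a lookup might look like with pymongo. The database name (admissions), the collection name (ad_information), and the field names (university, program, degree) are all assumptions made for illustration; check the documents the crawler actually wrote before relying on them.

```python
import re


def build_admission_filter(university, program, degree):
    """Build a MongoDB filter document for one university/program/degree.

    The field names used here are hypothetical, not taken from the crawler.
    """
    return {
        # Case-insensitive substring match, since posts may write the
        # university name in different letter cases.
        "university": {"$regex": re.escape(university), "$options": "i"},
        "program": program,
        "degree": degree,
    }


def find_admissions(collection, university, program, degree):
    """Return all matching admission cases from a pymongo collection."""
    return list(collection.find(build_admission_filter(university, program, degree)))


def main():
    # Imported here so the helpers above can be used without pymongo installed.
    from pymongo import MongoClient

    # Assumed connection string, database name, and collection name.
    client = MongoClient("mongodb://localhost:27017")
    collection = client["admissions"]["ad_information"]
    for case in find_admissions(collection, "NCSU", "CS", "Master"):
        print(case)
```

Calling main() runs the lookup against a local MongoDB instance. Keeping the filter in its own function makes it easy to inspect or reuse from the mongo shell.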

Current progress

See http://123.206.99.164/

Update log

  • 2017/3/11 Add MongoDB data storage

    • Bug to fix: some empty values are not added to the db, so fields no longer match each other
  • 2017/3/13 Rewrite the Selector module with regular expressions; empty values are now stored in the database, which fixes the bug above. Optimize the database storage so that each admission case is stored as a single object

  • 2017/4/10 Save the URL of each admission case to the db

  • 2017/7/3 Add admission status to db
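
The 2017/3/13 entry describes keeping empty values so that fields stay aligned, with each case stored as one object. The sketch below illustrates that idea only. It is not the project's Selector code, and the "label: value" input layout and the field names are hypothetical, since the real page markup is not shown in this README.

```python
import re

# Hypothetical field names; every stored case gets all of them.
FIELDS = ("university", "program", "degree", "result")

# One "key: value" pair per line. [ \t]* (not \s*) keeps an empty value
# from swallowing the newline and capturing the next line instead.
FIELD_PATTERN = re.compile(r"^(?P<key>\w+):[ \t]*(?P<value>.*)$", re.MULTILINE)


def parse_case(text):
    """Parse one admission case into a single dict with every field present.

    Fields that are empty or missing in the text are stored as "" rather
    than skipped, so values can never shift into the wrong field.
    """
    found = {
        match.group("key"): match.group("value").strip()
        for match in FIELD_PATTERN.finditer(text)
    }
    return {field: found.get(field, "") for field in FIELDS}


sample = "university: NCSU\nprogram:\ndegree: Master\n"
case = parse_case(sample)
```

Here case holds an empty string for both program (present but blank) and result (absent), so the dict has the same keys for every case and can be inserted into MongoDB as one document.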