Searching for professors in a given field is easier than you thought. Reverse search lets you find professors by field: say you need a list of all the professors working in nanomaterials. What's next? Keep reading.
This repository holds the source of the crawler that generates the data for mcmp Professor Reverse Search.
Prerequisites:
- Install Scrapy. Follow the Scrapy install guide, or run:
pip install scrapy
- Clone the repo and enter it:
git clone https://github.com/metakgp/mcmp.git
cd mcmp
- Check out the source code branch and move into the crawler directory:
git checkout source
cd reverse
- To run the crawler with the spider configuration in rsspider.py and store the data as JSON, use:
scrapy crawl reverse -o prof_details.json -t json
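Once the crawl finishes, a quick sanity check of the output can save some debugging later. The following is a minimal sketch, assuming the export wrote a JSON array of objects to prof_details.json (which is what Scrapy's JSON feed export produces). The helper name `summarize_items` is hypothetical and not part of this repo, and no field names are assumed: the script only reports whatever fields it finds.

```python
import json
import os
from collections import Counter


def summarize_items(path):
    """Return (item_count, Counter of field names) for a JSON array of objects.

    Hypothetical helper for illustration; not part of the mcmp repo.
    """
    with open(path, encoding="utf-8") as fh:
        items = json.load(fh)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of scraped items")
    field_counts = Counter()
    for item in items:
        # Count how many items carry each field, to spot missing data.
        field_counts.update(item.keys())
    return len(items), field_counts


if __name__ == "__main__":
    output = "prof_details.json"  # file name taken from the crawl command above
    if os.path.exists(output):
        count, fields = summarize_items(output)
        print(f"{count} items scraped")
        for name, seen in fields.most_common():
            print(f"  {name}: present in {seen} items")
    else:
        print(f"{output} not found; run the crawler first")
```

A field that shows up in far fewer items than the others is usually a hint that a scraping rule missed some pages.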
Please use the issues page to report bugs or file feature requests.
We love PRs. Before you begin developing, a short note:
- scrapy will scrape items written in reverse/reverse/items.py
- find scraping logic (= rules for scraping from html dump) in reverse/reverse/spiders/rsspiders.py
- departments.json contains list of all departments.
- prof_details.json uses professor.json to populate itself. (Itself, huh? Oh yes, yes you do it. Code is just a manifestation of your brilliant mind. :D )
- makelists.py processes prof_details.json and creates finallist.json, which is then used to create the fuzzy search box.
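To picture the last step of the pipeline, here is a rough sketch of how scraped records could be turned into a field-to-professors index and queried fuzzily. It is not the actual makelists.py: the field names `name` and `research_areas`, the function names, and the sample records are all hypothetical, and `difflib` from the standard library stands in for whatever fuzzy matching the search box really uses.

```python
import difflib
from collections import defaultdict


def build_reverse_index(professors):
    """Map each normalized research area to a sorted list of professor names.

    Assumes (hypothetically) that every record has a "name" key and an
    optional "research_areas" list of strings.
    """
    index = defaultdict(set)
    for prof in professors:
        for area in prof.get("research_areas", []):
            key = area.strip().lower()
            if key:
                index[key].add(prof["name"])
    return {area: sorted(names) for area, names in index.items()}


def fuzzy_lookup(index, query, cutoff=0.6):
    """Return the index entries whose area name is close to the query."""
    matches = difflib.get_close_matches(
        query.strip().lower(), list(index), n=3, cutoff=cutoff
    )
    return {area: index[area] for area in matches}


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        {"name": "Prof. A", "research_areas": ["Nanomaterials", "Thin Films"]},
        {"name": "Prof. B", "research_areas": ["nanomaterials"]},
        {"name": "Prof. C", "research_areas": ["Graph Theory"]},
    ]
    index = build_reverse_index(sample)
    print(fuzzy_lookup(index, "nanomaterial"))
```

Normalizing the area names before indexing keeps "Nanomaterials" and "nanomaterials" in one bucket, which matters more for a reverse search than the choice of fuzzy matcher.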
If you want to modify the webpage for mcmp, check out the gh-pages branch. For more, see the GitHub Pages documentation.