The Paul-Parser is used by iUPB to crawl all courses of the University of Paderborn PAUL system.

Paul-Parser

The Paul-Parser is used by iUPB to extract all courses of the University of Paderborn. The code is a bit messy, but it works quite well. Feel free to fork it and make it a bit more modular and maintainable.

API

If you are only interested in the course data, check out our course API at dev.yippie.io.

Setup

  • Install Ruby 1.9
  • Install MongoDB, e.g. with Homebrew: brew install mongodb
  • Start MongoDB on localhost
  • Install dependencies with bundle install
  • Download all courses of the current semester into mongodb with bundle exec ruby crawler.rb
  • In your mongodb, you will find a collection named raw_pages in the database paul
  • Analyse all courses with bundle exec ruby parser.rb
  • Open the collection named courses and do whatever you like with the information
  • To export the parsed data, run mongoexport --db paul --collection courses > courses.json
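
Once courses.json exists, it can be read back in Ruby. The following is only a minimal sketch, not part of the Paul-Parser itself: it relies on mongoexport's default output of one JSON document per line, and the helper names load_courses and field_counts as well as the "title" field in the sample data are made up for illustration. The real field names depend on what parser.rb stores, so the sketch only counts which fields occur.

```ruby
require 'json'
require 'stringio'

# Parse mongoexport output: one JSON document per line, blank lines skipped.
# `io` can be anything that responds to each_line (File, StringIO, String).
def load_courses(io)
  io.each_line.map(&:strip).reject(&:empty?).map { |line| JSON.parse(line) }
end

# Count how often each top-level field name occurs across all documents.
def field_counts(courses)
  counts = Hash.new(0)
  courses.each { |course| course.keys.each { |key| counts[key] += 1 } }
  counts
end

# Made-up sample standing in for courses.json; "title" is a hypothetical field.
sample = StringIO.new(<<-EXPORT)
{"_id": {"$oid": "000000000000000000000001"}, "title": "Example course A"}
{"_id": {"$oid": "000000000000000000000002"}, "title": "Example course B"}
EXPORT

courses = load_courses(sample)
puts "Loaded #{courses.size} courses"
field_counts(courses).each { |field, count| puts "#{field}: #{count}" }
```

To run this against the real export, pass an open file instead of the sample, for example File.open("courses.json") { |f| load_courses(f) }. If the export was created with the --jsonArray option, the file is a single JSON array and should be read with one JSON.parse call instead.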

Contribute

Fork our repository, make and test your changes, and then open a pull request.

License

This is GPL v3 software.