What does this thing do?

Given a Glassdoor interview URL, this extracts the interview questions (and their answers).

Requirements

Scrapy (it provides the scrapy command used below).

Usage

$ scrapy crawl << spider name >> -a url=<< glassdoor url >>

Note: The URL should have this form: https://www.glassdoor.com/Interview/*.html
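
If you want to check a URL before starting a crawl, something like the following could work. This is only a sketch based on the pattern in the note above: the function name `looks_like_interview_url` and the example URLs are hypothetical and not part of this project, and the spider itself may accept or reject URLs differently.

```python
from urllib.parse import urlparse


def looks_like_interview_url(url: str) -> bool:
    """Return True if url matches https://www.glassdoor.com/Interview/*.html.

    Hypothetical helper: it mirrors the documented pattern only and does not
    check that the page actually exists.
    """
    parts = urlparse(url)
    return (
        parts.scheme == "https"
        and parts.netloc == "www.glassdoor.com"
        and parts.path.startswith("/Interview/")
        and parts.path.endswith(".html")
    )
```

For example, a made-up URL such as https://www.glassdoor.com/Interview/example-company.html passes the check, while one under /Reviews/ does not.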

Results will be stored in /tmp/output.json.

Output example:

{
  "questions": "...",
  "role": "...",
  "answers": [...],
  "answers_url": "..."
}
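
Once a crawl has finished, the results can be read back with a few lines of Python. The sketch below assumes the file holds either a single JSON object shaped like the example above or a JSON array of such objects; the exact layout is not documented here, so adjust as needed. The helper names `load_items` and `summarize` are hypothetical.

```python
import json
from pathlib import Path


def load_items(path="/tmp/output.json"):
    """Load scraped items and always return a list of dicts.

    Assumption: the file is one JSON document, either a single object
    or an array of objects.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return data if isinstance(data, list) else [data]


def summarize(item):
    """Build a one-line summary from the keys shown in the output example."""
    role = item.get("role", "?")
    question = item.get("questions", "?")
    answer_count = len(item.get("answers", []))
    return f"{role}: {question} ({answer_count} answers)"
```

Calling `summarize` on each entry returned by `load_items()` prints the role, the question text, and the number of answers collected for it.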