Question Generator

Learning to generate questions from text.
Blog on this project :



  • Sentence Selection: This module selects topically important sentences from a text document.
  • Gap Selection: This module uses the Stanford Parser to extract NPs (noun phrases) and ADJPs (adjective phrases) from the important sentences as candidate gaps.
  • Question Formation: This module generates the actual questions from the fill-in-the-blank questions. It uses the NLTK parser and grammar-based syntax rules to do so.
  • Question Classification: This module classifies question quality with a pre-trained SVM classifier (the classifier is trained only on fill-in-the-blank questions).
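
To make the gap idea concrete, here is a minimal sketch of how a candidate gap can be turned into a fill-in-the-blank question. It is not the project's code: the function name make_blank_question, the BLANK marker, and the hard-coded candidate phrases are all hypothetical, and in the real pipeline the candidates would come from the Stanford Parser rather than a hand-written list.

```python
import re

BLANK = "_____"  # hypothetical blank marker, chosen for this sketch


def make_blank_question(sentence: str, gap: str) -> str:
    """Replace the first whole-phrase occurrence of `gap` with a blank."""
    # Lookarounds keep the match from starting or ending inside a longer word.
    pattern = r"(?<!\w)" + re.escape(gap) + r"(?!\w)"
    question, count = re.subn(pattern, BLANK, sentence, count=1)
    if count == 0:
        raise ValueError(f"gap {gap!r} not found in sentence")
    return question


sentence = "The mitochondria is the powerhouse of the cell."
# Assumption: these stand in for NP candidates produced by the parser.
candidate_gaps = ["The mitochondria", "the cell"]

# Each pair is (question text, answer that fills the blank).
questions = [(make_blank_question(sentence, gap), gap) for gap in candidate_gaps]
```

Each candidate gap yields one question, and the removed phrase is kept as the expected answer.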


Build Project

git clone
cd Automatic_Question_Generation 
pip install -r requirements.txt

Build Stanford Parser & NER

  • Create a folder to host all the Stanford models, e.g. mkdir /your-path-to-stanford-models/stanford-models.
  • Download Stanford Parser here, unzip it, and:
    • Move stanford-parser.jar to the Stanford models folder, e.g. /your-path-to-stanford-models/stanford-models/stanford-parser.jar
    • Move stanford-parser-x-x-x-models.jar to the Stanford models folder.
    • Unzip stanford-parser-x-x-x-models.jar and move /edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz to stanford-models/
  • Download Stanford NER here, unzip it, and:
    • Move stanford-ner.jar to the Stanford models folder.
    • Move stanford-ner-x-x-x.jar to the Stanford models folder (x-x-x is the version, e.g. 3.7.0).
    • Move /classifiers/english.all.3class.distsim.crf.ser.gz to the Stanford models folder.

The Stanford models folder should look like this:

- stanford-models/
    | - stanford-parser.jar
    | - stanford-parser-x-x-x-models.jar
    | - englishPCFG.ser.gz
    | - stanford-ner.jar
    | - stanford-ner-x-x-x.jar
    | - english.all.3class.distsim.crf.ser.gz
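
A quick way to confirm the layout is to check the folder for the six files listed above. The helper below is only a sketch and is not part of the project: missing_model_files is a hypothetical name, and the "*" in the patterns stands in for the version numbers written as x-x-x.

```python
from pathlib import Path

# Glob patterns for the six expected files; "*" replaces the x-x-x version.
EXPECTED_PATTERNS = [
    "stanford-parser.jar",
    "stanford-parser-*-models.jar",
    "englishPCFG.ser.gz",
    "stanford-ner.jar",
    "stanford-ner-*.jar",
    "english.all.3class.distsim.crf.ser.gz",
]


def missing_model_files(models_dir):
    """Return the patterns that match no file directly inside models_dir."""
    root = Path(models_dir)
    return [pattern for pattern in EXPECTED_PATTERNS if not any(root.glob(pattern))]
```

Calling missing_model_files("/your-path-to-stanford-models/stanford-models") returns an empty list when every file is in place, and otherwise the patterns that still need a matching file.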

Environment Variables

Create an environment variable file for configuration in the project root with: touch .env

SENTENCE_RATIO = 0.05 #The threshold of important sentences
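
For illustration, a file in this format can be read with the standard library alone. This is a sketch under assumptions, not the loader the project uses: read_env and sentence_ratio are hypothetical helpers, and the fallback of 0.05 is simply the example value shown above.

```python
from pathlib import Path


def read_env(path=".env"):
    """Parse KEY = value lines, ignoring blank lines and '#' comments."""
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.split("#", 1)[0].strip()  # drop inline comments
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip()
    return values


def sentence_ratio(env, default=0.05):
    """Return SENTENCE_RATIO as a float, checking that it lies in [0, 1]."""
    ratio = float(env.get("SENTENCE_RATIO", default))
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("SENTENCE_RATIO must be in the range [0, 1]")
    return ratio
```

With the file above in the project root, sentence_ratio(read_env()) yields the configured ratio, and an out-of-range value fails early instead of silently selecting the wrong number of sentences.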



Important Variables

ID | Variable Name     | Variable Location               | Use
1  | SENTENCE_RATIO    | .env file                       | Controls the fraction of sentences selected from the given text. Range [0,1]
2  | len(entities) > 7 | aqg/utils/gap_selection line 58 | Eliminates any sentence with more than 7 entities
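
The effect of these two settings can be sketched as follows. This is an illustration only and not the code in aqg/utils: the names select_count and drop_entity_heavy are hypothetical, and rounding up with a minimum of one sentence is an assumption about how the ratio is turned into a count.

```python
import math

MAX_ENTITIES = 7  # sentences with more than 7 entities are eliminated


def select_count(total_sentences, ratio):
    """Number of sentences to keep for a given SENTENCE_RATIO.

    Assumption: round up, and keep at least one sentence for non-empty text.
    """
    if total_sentences == 0:
        return 0
    return max(1, math.ceil(total_sentences * ratio))


def drop_entity_heavy(sentences):
    """Filter (text, entities) pairs, keeping those with at most 7 entities."""
    return [(text, ents) for text, ents in sentences if len(ents) <= MAX_ENTITIES]
```

A larger SENTENCE_RATIO keeps more sentences and so produces more candidate questions, while the entity limit removes sentences that would offer too many competing gaps.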