Predicting-the-human-difficulty-of-a-question

Creating a classifier that can distinguish the final line of a high school question from the final line of a college question.

I evaluated five models on classifying the last line of each type of question:

  1. BERT
  2. ALBERT
  3. GCN - Graph Convolutional Network (transductive, but well suited to semi-supervised learning)
  4. GCN - Cheby (Graph Convolutional Networks using Chebyshev Polynomials)
  5. Hybrid GCN + BERT Model
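
For readers unfamiliar with the two graph models in the list, the following is a minimal NumPy sketch of the standard GCN propagation rule and of the Chebyshev polynomial basis used by GCN-Cheby. It is an illustration under stated assumptions, not the code in this repository: the function names, the toy graph, and the use of an undirected (symmetric) adjacency matrix are all hypothetical.

```python
import numpy as np


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the renormalized adjacency used by GCN."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops
    deg = a_hat.sum(axis=1)              # degrees are >= 1 thanks to self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj_norm, features, weights):
    """One GCN layer: ReLU(A_norm @ X @ W)."""
    return np.maximum(adj_norm @ features @ weights, 0.0)


def chebyshev_basis(adj, order):
    """Return [T_0, ..., T_order] evaluated on the rescaled graph Laplacian.

    L = I - D^{-1/2} A D^{-1/2}, L_scaled = 2 L / lambda_max - I,
    T_0 = I, T_1 = L_scaled, T_k = 2 L_scaled T_{k-1} - T_{k-2}.
    Assumes `adj` is symmetric (undirected graph).
    """
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    # Isolated nodes get a zero scaling factor instead of a division by zero.
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    lap = np.eye(n) - adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    lam_max = np.linalg.eigvalsh(lap).max()
    lap_scaled = 2.0 * lap / lam_max - np.eye(n)
    basis = [np.eye(n), lap_scaled]
    for _ in range(2, order + 1):
        basis.append(2.0 * lap_scaled @ basis[-1] - basis[-2])
    return basis[: order + 1]


# Toy usage: a 3-node path graph with 2 input features and 4 hidden units.
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
features = np.ones((3, 2))
weights = np.ones((2, 4))
hidden = gcn_layer(normalized_adjacency(adj), features, weights)
cheb = chebyshev_basis(adj, order=2)
```

A GCN-Cheby layer would then sum `T_k @ X @ W_k` over the basis, with one weight matrix per polynomial order, so each layer mixes information from nodes up to `order` hops away.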

Requirements

  1. pip3 install -r requirements.txt
  2. Download the questions from here and save them as 'qanta.train.json'. Alternatively, set the variable 'questions_file' in test_queries.py to the path of the downloaded file.
  3. Download the documents from here and save them as 'wiki_lookup.json'. Alternatively, set the variable 'file_name_documents' in Index_Creation_code.py to the path of the downloaded file.
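
Once the questions file is in place, the last line of each question can be paired with its difficulty label roughly as sketched below. This is a sketch, not the repository's preprocessing: it assumes the JSON has a top-level "questions" list whose entries carry "text" and "difficulty" fields, and the helper names `last_sentence`, `build_examples`, and `load_examples` are hypothetical. The sentence split is deliberately naive and will mis-split on abbreviations such as "Dr.".

```python
import json
import re

# Assumed layout of qanta.train.json (not confirmed by this README):
# {"questions": [{"text": "...", "difficulty": "...", ...}, ...]}


def last_sentence(text):
    """Return the final sentence of a question using a naive punctuation split."""
    parts = [p.strip() for p in re.split(r"(?<=[.!?])\s+", text.strip()) if p.strip()]
    return parts[-1] if parts else ""


def build_examples(data):
    """Pair each question's last sentence with its difficulty label."""
    examples = []
    for question in data.get("questions", []):
        sentence = last_sentence(question.get("text", ""))
        label = question.get("difficulty")
        if sentence and label is not None:   # skip empty or unlabeled questions
            examples.append((sentence, label))
    return examples


def load_examples(path="qanta.train.json"):
    """Read the questions file and return (last_sentence, difficulty) pairs."""
    with open(path, encoding="utf-8") as handle:
        return build_examples(json.load(handle))
```

Calling `load_examples()` with no arguments reads 'qanta.train.json' from the working directory; pass a different path if the file was saved elsewhere.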

Steps to run the code:

  1. Install everything listed in requirements.txt using the pip3 command above.
  2. python run.py (comment or uncomment the relevant lines to choose which models to run)

Results

| Model | Test accuracy (%) |
| --- | --- |
| BERT | 69.8 |
| ALBERT | 64.4 |
| GCN | 54.27 |
| GCN-Cheby | 59.0 |
| Hybrid GCN + BERT | 71.9 |
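
The README does not state how test accuracy was computed. Assuming the usual definition (the share of test examples whose predicted label matches the true label, expressed as a percentage), it could be calculated as in the sketch below; `accuracy_percent` is a hypothetical helper, not a function from this repository.

```python
def accuracy_percent(y_true, y_pred):
    """Return the percentage of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


# Toy usage with made-up labels: three of four predictions are correct.
score = accuracy_percent(["HS", "College", "HS", "HS"],
                         ["HS", "College", "College", "HS"])
```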