This project builds a classifier that distinguishes the final line of a high school question from the final line of a college/high school question.
I used five models to compare classification accuracy on the last line of each type of question:
- BERT
- ALBERT
- Graph Convolutional Network (GCN): transductive in nature, but well suited to semi-supervised learning
- GCN - Cheby (Graph Convolutional Networks using Chebyshev Polynomials)
- Hybrid GCN + BERT Model
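
For readers new to graph convolutions, the propagation rule of a standard GCN layer (the widely used formulation `ReLU(A_norm · H · W)`, where `A_norm` is the symmetrically normalized adjacency matrix with self-loops) can be sketched in plain Python. This is only an illustration and is not taken from this repository's code: the toy graph, the feature and weight values, and the helper names `normalize_adjacency`, `matmul`, and `gcn_layer` are all assumptions. It does not cover the Chebyshev variant.

```python
import math


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square adjacency matrix without self-loops."""
    n = len(adj)
    # Add self-loops so each node also keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_hat]
    return [
        [a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, features, weights):
    """Apply one GCN layer: ReLU(A_norm @ features @ weights)."""
    a_norm = normalize_adjacency(adj)
    hidden = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, value) for value in row] for row in hidden]


# Hypothetical toy graph: three nodes connected in a path 0 - 1 - 2.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one row of features per node
weights = [[1.0], [-1.0]]                         # maps 2 features to 1 output
a_norm = normalize_adjacency(adj)
out = gcn_layer(adj, features, weights)
```

Because every node's output mixes in its neighbours' features, the whole graph (including unlabeled nodes) must be present at training time, which is why this kind of model is described as transductive.
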

To set up and run the project:
- Download questions from here and save as 'qanta.train.json'. Alternatively, change the value of the variable 'questions_file' in test_queries.py to the correct path.
- Download documents from here and save as 'wiki_lookup.json'. Alternatively, change the value of the variable 'file_name_documents' in Index_Creation_code.py to the correct path.
- Install the requirements listed in requirements.txt: pip3 install -r requirements.txt
- Run the models: python run.py (comment or uncomment the relevant lines to choose which models run)
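
As a rough sketch of the data preparation this task implies, the snippet below pulls the last sentence out of each question and pairs it with a difficulty label. It is not the repository's actual preprocessing. It assumes the questions file is JSON with a top-level 'questions' list whose entries have 'text' and 'difficulty' fields; the label values 'HS' and 'College', the sample questions, and the helper names `last_sentence`, `build_examples`, and `load_questions` are hypothetical. The sentence split is a naive punctuation-based one.

```python
import json
import re


def last_sentence(text):
    """Return the final sentence of a question using a naive punctuation split."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    sentences = [part.strip() for part in parts if part.strip()]
    return sentences[-1] if sentences else ""


def build_examples(questions):
    """Pair each question's last sentence with its difficulty label."""
    return [
        (last_sentence(q["text"]), q["difficulty"])
        for q in questions
        if q.get("text") and q.get("difficulty")
    ]


def load_questions(path):
    """Load the question list, assuming a top-level 'questions' key (an assumption)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)["questions"]


# Hypothetical in-memory sample standing in for the downloaded file.
sample = [
    {
        "text": "This author wrote Hamlet. For 10 points, name this English playwright.",
        "difficulty": "HS",
    },
    {
        "text": "This process is modeled by the Fokker-Planck equation. "
                "For 10 points, name this random motion of particles.",
        "difficulty": "College",
    },
]
examples = build_examples(sample)
```

With the real file in place, `build_examples(load_questions("qanta.train.json"))` would produce the same kind of (sentence, label) pairs, provided the file matches the assumed layout.
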

Model | Test Accuracy (%) |
---|---|
BERT | 69.8 |
ALBERT | 64.4 |
GCN | 54.27 |
GCN-Cheby | 59.0 |
Hybrid GCN + BERT | 71.9 |