This project started by referring to graykode's project, which solved TOEIC Part 5 (sentence-completion) questions with a pretrained pytorch-pretrained-BERT model (not fine-tuned). Our goal was to increase the accuracy on TOEIC Part 5 questions by fine-tuning the pretrained BERT.
We collected 6,100 TOEIC Part 5 questions and used 85% (5,185) for training and 15% (915) for testing. You can see the whole process of this project here.
There are two types of questions in TOEIC Part 5, as shown below:
Question : The marketing seminar is being [ ? ] from August 8th through the 11th at Rupp Convention Center.
a) held
b) holds
c) holding
d) hold
Question : The appointment will bring a great deal of [ ? ].
a) prestige
b) testimony
c) willpower
d) virtuosity
We have to choose the best of the four candidates given in each question.
We first measured the performance of the pretrained BERT using the transformers package provided by Huggingface. Here is an example of the problems we used for testing.
```python
{
    '1': {'question': 'His allergy symptoms _ with the arrival of summer.',
          'answer': 'worsen',
          '1': 'bad',
          '2': 'worse',
          '3': 'worst',
          '4': 'worsen'},
    '2': {'question': 'He told us that some fans lined up outside of the box office to _ a ticket for the concert.',
          'answer': 'purchase',
          '1': 'achieve',
          '2': 'purchase',
          '3': 'replace',
          '4': 'support'}
}
```
To solve these blank-filling problems with Huggingface's pytorch-pretrained-BERT model, we borrowed the method suggested by graykode.
As a result of the test, the pretrained BertForMaskedLM already achieved an accuracy of 83.8%. We then tried to fine-tune the pretrained BertForMaskedLM with the 5,185 training questions, but since this was no different from BERT's original pretraining task, it brought no improvement in test accuracy. The dataset we created to fine-tune the MaskedLM model is as follows.
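The borrowed approach can be sketched as follows: replace the blank with BERT's `[MASK]` token, run BertForMaskedLM, and pick the candidate with the highest probability at the masked position. This is a minimal sketch, not the project's exact code; for simplicity it assumes each candidate is a single WordPiece token (multi-token candidates would need token-by-token scoring).

```python
import torch
from transformers import BertForMaskedLM, BertTokenizer

def fill_mask(question, mask_token):
    """Replace the blank marker '_' in a question with the model's mask token."""
    return question.replace("_", mask_token)

def solve_blank(question, candidates):
    """Return the candidate with the highest MLM probability at the blank.
    Sketch only: assumes each candidate is a single WordPiece token."""
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForMaskedLM.from_pretrained("bert-base-uncased")
    model.eval()
    inputs = tokenizer(fill_mask(question, tokenizer.mask_token), return_tensors="pt")
    # Locate the [MASK] position in the tokenized input.
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0].item()
    with torch.no_grad():
        probs = model(**inputs).logits[0, mask_pos].softmax(dim=-1)
    # Score each candidate by the probability of its token id at the mask.
    return max(candidates, key=lambda c: probs[tokenizer.convert_tokens_to_ids(c)].item())
```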
Sentence(X) | Output(Y)
---|---
The marketing seminar is being [ ? ] from August 8th through the 11th at Rupp Convention Center. | held (correct answer)
The appointment will bring a great deal of [ ? ]. | prestige (correct answer)
Therefore, we devised a method to increase accuracy by slightly changing the task so that the model learns grammatical features. We used a model with a linear layer for binary classification on top of pretrained BERT (Huggingface's BertForSequenceClassification).
To fine-tune this classification model, we created four training samples from each question, as below.
Question : The marketing seminar is being [ ? ] from August 8th through the 11th at Rupp Convention Center.
a) held
b) holds
c) holding
d) hold
Sentence(X) | Output(Y)
---|---
The marketing seminar is being held from August 8th through the 11th at Rupp Convention Center. | True(1)
The marketing seminar is being holds from August 8th through the 11th at Rupp Convention Center. | False(0)
The marketing seminar is being holding from August 8th through the 11th at Rupp Convention Center. | False(0)
The marketing seminar is being hold from August 8th through the 11th at Rupp Convention Center. | False(0)
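The expansion of one question into four labeled sentences can be sketched with a small helper (`expand_question` is a hypothetical name, not from the project's code):

```python
def expand_question(question, candidates, answer):
    """Turn one Part 5 question into four (sentence, label) pairs:
    the sentence filled with the correct candidate is labeled 1 (True),
    the other three are labeled 0 (False)."""
    return [(question.replace("[ ? ]", cand), 1 if cand == answer else 0)
            for cand in candidates]

question = ("The marketing seminar is being [ ? ] from August 8th "
            "through the 11th at Rupp Convention Center.")
samples = expand_question(question, ["held", "holds", "holding", "hold"], "held")
# samples holds four sentences, exactly one of them labeled 1.
```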
We obtained 20,744 training samples from 5,186 questions and fine-tuned the BertForSequenceClassification model on them.
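A minimal fine-tuning loop over these (sentence, label) pairs might look like the sketch below. This is not the project's training script: the batch size, learning rate, and epoch count are illustrative assumptions, and real training would add shuffling, a GPU device, and a learning-rate schedule.

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

def make_batches(samples, batch_size):
    """Split a list of (sentence, label) pairs into fixed-size batches."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]

def finetune(samples, epochs=3, batch_size=32, lr=2e-5):
    """Fine-tune a binary sentence classifier on (sentence, label) pairs."""
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in make_batches(samples, batch_size):
            sentences = [s for s, _ in batch]
            labels = torch.tensor([y for _, y in batch])
            inputs = tokenizer(sentences, padding=True, truncation=True,
                               return_tensors="pt")
            # The model computes cross-entropy loss when labels are passed.
            loss = model(**inputs, labels=labels).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
    return model
```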
To solve Part 5 questions with the fine-tuned classifier model, we input each full sentence obtained by filling the blank with one of the four candidates. Our input sentences X and the model's outputs are shown below.
Question : The marketing seminar is being [ ? ] from August 8th through the 11th at Rupp Convention Center.
a) held
b) holds
c) holding
d) hold
 | Input(X) | Output(Y)
---|---|---
X1 | The marketing seminar is being held from August 8th through the 11th at Rupp Convention Center. | BertForSeqClassification(X1) = [logitTrue(X1), logitFalse(X1)]
X2 | The marketing seminar is being holds from August 8th through the 11th at Rupp Convention Center. | BertForSeqClassification(X2) = [logitTrue(X2), logitFalse(X2)]
X3 | The marketing seminar is being holding from August 8th through the 11th at Rupp Convention Center. | BertForSeqClassification(X3) = [logitTrue(X3), logitFalse(X3)]
X4 | The marketing seminar is being hold from August 8th through the 11th at Rupp Convention Center. | BertForSeqClassification(X4) = [logitTrue(X4), logitFalse(X4)]
If each sentence X above is tokenized and fed into the fine-tuned BertForSequenceClassification model, the model outputs a logit for the sentence being true and one for it being false. Finally, we predict the candidate with the highest logitTrue as the correct answer.
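This prediction step can be sketched as below. The function names are hypothetical, and the sketch assumes label 1 was used for True during fine-tuning, so column 1 of the logits is logitTrue.

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

def pick_answer(logits_true, candidates):
    """Select the candidate whose filled-in sentence got the highest 'True' logit."""
    best = max(range(len(candidates)), key=lambda i: logits_true[i])
    return candidates[best]

def predict(question, candidates, tokenizer, model):
    """Fill the blank with each candidate, classify all four sentences in one
    batch, and return the candidate with the highest 'True' logit.
    Assumes label 1 = True, as in the training-data sketch above."""
    sentences = [question.replace("[ ? ]", c) for c in candidates]
    inputs = tokenizer(sentences, padding=True, truncation=True,
                       return_tensors="pt")
    model.eval()
    with torch.no_grad():
        logits = model(**inputs).logits      # shape (4, 2)
    logits_true = logits[:, 1].tolist()      # column for label 1 = True
    return pick_answer(logits_true, candidates)
```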