This project started by referring to graykode's project, which solved TOEIC Part 5 (sentence-completion) questions with a pretrained pytorch-pretrained-BERT model (not fine-tuned). Our goal was to increase the accuracy on TOEIC Part 5 questions by fine-tuning the pretrained BERT.
We collected 6,100 TOEIC Part 5 questions and used 85% (5,185) for training and 15% (915) for testing. You can see the whole process of this project here.
There are two types of questions in TOEIC Part 5, as shown below:
Question : The marketing seminar is being [ ? ] from August 8th through the 11th at Rupp Convention Center.
a) held
b) holds
c) holding
d) hold
Question : The appointment will bring a great deal of [ ? ].
a) prestige
b) testimony
c) willpower
d) virtuosity
We have to choose the best of the four candidates given in each question.
We first measured the performance of the pretrained BERT using the transformers package provided by Huggingface. Here is an example of the problems we used for testing.
```python
{
    '1': {'question': 'His allergy symptoms _ with the arrival of summer.',
          'answer': 'worsen',
          '1': 'bad',
          '2': 'worse',
          '3': 'worst',
          '4': 'worsen'},
    '2': {'question': 'He told us that some fans lined up outside of the box office to _ a ticket for the concert.',
          'answer': 'purchase',
          '1': 'achieve',
          '2': 'purchase',
          '3': 'replace',
          '4': 'support'}
}
```
To solve these blank-filling problems with Huggingface's pytorch-pretrained-BERT model, we borrowed the method suggested by graykode.
As a result of the test, the pretrained BertForMaskedLM already achieved an accuracy of 83.8%. We then tried to fine-tune the pretrained BertForMaskedLM with the 5,185 training questions, but since this was no different from BERT's original pretraining task, it brought no improvement in test accuracy. The dataset we created to fine-tune the MaskedLM model is as follows.
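The borrowed approach can be sketched as follows: replace the blank with BERT's `[MASK]` token, run BertForMaskedLM, and pick the candidate with the highest probability at the masked position. This is a minimal sketch, not the project's exact code; for simplicity it assumes each candidate is a single WordPiece token (multi-token candidates would need token-by-token scoring).

```python
import torch
from transformers import BertForMaskedLM, BertTokenizer

def fill_mask(question, mask_token):
    """Replace the blank marker '_' in a question with the model's mask token."""
    return question.replace("_", mask_token)

def solve_blank(question, candidates):
    """Return the candidate with the highest MLM probability at the blank.
    Sketch only: assumes each candidate is a single WordPiece token."""
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForMaskedLM.from_pretrained("bert-base-uncased")
    model.eval()
    inputs = tokenizer(fill_mask(question, tokenizer.mask_token), return_tensors="pt")
    # Locate the [MASK] position in the tokenized input.
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0].item()
    with torch.no_grad():
        probs = model(**inputs).logits[0, mask_pos].softmax(dim=-1)
    # Score each candidate by the probability of its token id at the mask.
    return max(candidates, key=lambda c: probs[tokenizer.convert_tokens_to_ids(c)].item())
```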
Sentence(X) | Output(Y)
---|---
The marketing seminar is being [ ? ] from August 8th through the 11th at Rupp Convention Center. | held (correct answer)
The appointment will bring a great deal of [ ? ]. | prestige (correct answer)
Therefore, we devised a method to increase accuracy by slightly changing the task so that the model learns grammatical features. We used a model with a linear layer for binary classification on top of pretrained BERT (Huggingface's BertForSequenceClassification).
To fine-tune this classification model, we created four training samples from each question, as below.
Question : The marketing seminar is being [ ? ] from August 8th through the 11th at Rupp Convention Center.
a) held
b) holds
c) holding
d) hold
Sentence(X) | Output(Y)
---|---
The marketing seminar is being held from August 8th through the 11th at Rupp Convention Center. | True(1)
The marketing seminar is being holds from August 8th through the 11th at Rupp Convention Center. | False(0)
The marketing seminar is being holding from August 8th through the 11th at Rupp Convention Center. | False(0)
The marketing seminar is being hold from August 8th through the 11th at Rupp Convention Center. | False(0)
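The expansion of one question into four labeled sentences can be sketched with a small helper (`expand_question` is a hypothetical name, not from the project's code):

```python
def expand_question(question, candidates, answer):
    """Turn one Part 5 question into four (sentence, label) pairs:
    the sentence filled with the correct candidate is labeled 1 (True),
    the other three are labeled 0 (False)."""
    return [(question.replace("[ ? ]", cand), 1 if cand == answer else 0)
            for cand in candidates]

question = ("The marketing seminar is being [ ? ] from August 8th "
            "through the 11th at Rupp Convention Center.")
samples = expand_question(question, ["held", "holds", "holding", "hold"], "held")
# samples holds four sentences, exactly one of them labeled 1.
```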
We obtained 20,744 training samples from 5,186 questions and fine-tuned the BertForSequenceClassification model on them.
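A minimal fine-tuning loop over these (sentence, label) pairs might look like the sketch below. This is not the project's training script: the batch size, learning rate, and epoch count are illustrative assumptions, and real training would add shuffling, a GPU device, and a learning-rate schedule.

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

def make_batches(samples, batch_size):
    """Split a list of (sentence, label) pairs into fixed-size batches."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]

def finetune(samples, epochs=3, batch_size=32, lr=2e-5):
    """Fine-tune a binary sentence classifier on (sentence, label) pairs."""
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in make_batches(samples, batch_size):
            sentences = [s for s, _ in batch]
            labels = torch.tensor([y for _, y in batch])
            inputs = tokenizer(sentences, padding=True, truncation=True,
                               return_tensors="pt")
            # The model computes cross-entropy loss when labels are passed.
            loss = model(**inputs, labels=labels).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
    return model
```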
To solve Part 5 questions with the fine-tuned classifier model, we input each full sentence obtained by filling the blank with one of the four candidates. Our input sentences X and the model's outputs are shown below.
Question : The marketing seminar is being [ ? ] from August 8th through the 11th at Rupp Convention Center.
a) held
b) holds
c) holding
d) hold
 | Input(X) | Output(Y)
---|---|---
X1 | The marketing seminar is being held from August 8th through the 11th at Rupp Convention Center. | BertForSeqClassification(X1) = [logitTrue(X1), logitFalse(X1)]
X2 | The marketing seminar is being holds from August 8th through the 11th at Rupp Convention Center. | BertForSeqClassification(X2) = [logitTrue(X2), logitFalse(X2)]
X3 | The marketing seminar is being holding from August 8th through the 11th at Rupp Convention Center. | BertForSeqClassification(X3) = [logitTrue(X3), logitFalse(X3)]
X4 | The marketing seminar is being hold from August 8th through the 11th at Rupp Convention Center. | BertForSeqClassification(X4) = [logitTrue(X4), logitFalse(X4)]
If each sentence X above is tokenized and fed into the fine-tuned BertForSequenceClassification model, the model outputs a logit for the sentence being true and one for it being false. Finally, we predict the candidate with the highest logitTrue as the correct answer.
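This prediction step can be sketched as below. The function names are hypothetical, and the sketch assumes label 1 was used for True during fine-tuning, so column 1 of the logits is logitTrue.

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

def pick_answer(logits_true, candidates):
    """Select the candidate whose filled-in sentence got the highest 'True' logit."""
    best = max(range(len(candidates)), key=lambda i: logits_true[i])
    return candidates[best]

def predict(question, candidates, tokenizer, model):
    """Fill the blank with each candidate, classify all four sentences in one
    batch, and return the candidate with the highest 'True' logit.
    Assumes label 1 = True, as in the training-data sketch above."""
    sentences = [question.replace("[ ? ]", c) for c in candidates]
    inputs = tokenizer(sentences, padding=True, truncation=True,
                       return_tensors="pt")
    model.eval()
    with torch.no_grad():
        logits = model(**inputs).logits      # shape (4, 2)
    logits_true = logits[:, 1].tolist()      # column for label 1 = True
    return pick_answer(logits_true, candidates)
```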