pre-training
wjczf123 opened this issue · 10 comments
File "/deepo_data/pretrain/code/model.py", line 152, in forward
l_h_state = l_outputs[0][indice, l_ph] # (batch, hidden_size)
result = self.forward(*input, **kwargs)
File "/deepo_data/CZF/Counterfactual-RE/origin/pretrain/code/model.py", line 152, in forward
IndexError: too many indices for tensor of dimension 0
I encountered this bug. Also, can you provide your pretrained MTB and CP models (as in step 2 of the pretrain instructions; it seems that MTB and CP in step 2 still need pre-training)?
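For context, a minimal sketch of how this IndexError can arise (the tensor names mirror the traceback, but the shapes and the failure scenario are assumptions, not the repo's actual code): advanced indexing with two index tensors only works on a tensor with at least two dimensions, so if l_outputs[0] somehow ends up 0-dimensional, for instance because a mismatched transformers version returns a different output structure, exactly this message is raised.

    import torch

    # Assumed shapes for illustration; the real model's shapes may differ.
    batch, seq_len, hidden_size = 4, 16, 8
    l_outputs = (torch.randn(batch, seq_len, hidden_size),)  # e.g. last_hidden_state
    indice = torch.arange(batch)                 # one row index per example
    l_ph = torch.randint(0, seq_len, (batch,))   # one token position per example

    # Normal case: advanced indexing picks one hidden state per example.
    l_h_state = l_outputs[0][indice, l_ph]       # (batch, hidden_size)
    print(l_h_state.shape)                       # torch.Size([4, 8])

    # Failure case: the same indexing on a 0-dim tensor reproduces the error.
    scalar = torch.tensor(0.0)
    try:
        scalar[indice, l_ph]
    except IndexError as e:
        print(e)  # too many indices for tensor of dimension 0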
For the first problem: did you install the version of transformers specified in our repo?
For the second problem: the ckpts provided in step 2 of pretrain are the final checkpoints we used.
For the second problem, we used the MTB model you provided, but the results seem wrong.
The run.sh is:
ckpt="MTB"
for seed in 42 43 44 45 46
do
bash train.sh 1 $seed $ckpt 0.01 20
done
And I put the MTB into pretrain/ckpt/.
I ran this code three times and the results are as follows:
MTB wiki80 Thu Aug 5 06:59:15 2021
@Result: Best Dev score is 0.481, Test score is 0.510
MTB wiki80 Thu Aug 5 07:03:47 2021
@Result: Best Dev score is 0.502, Test score is 0.537
MTB wiki80 Thu Aug 5 07:08:15 2021
@Result: Best Dev score is 0.498, Test score is 0.532
Is there a problem with how I am running it?
You can look into train.sh. The 0.01 means the proportion of the training set; if you want to use the normal supervised setting, set it to 1.
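For illustration only, here is a minimal sketch of what a "proportion of training set" argument like this typically does; the function name and sampling details are assumptions, not the repo's actual implementation:

    import random

    def subsample(examples, proportion, seed):
        # Hypothetical helper: deterministically keep `proportion` of the
        # training examples, so each seed selects its own small subset.
        rng = random.Random(seed)
        k = max(1, int(len(examples) * proportion))
        return rng.sample(examples, k)

    # e.g. the 1% setting with seed 42 (proportion=0.01, as in run.sh)
    # train_subset = subsample(train_examples, 0.01, 42)

Because a 1% subset is so small, which examples end up in it depends heavily on the seed, which is why scores in this setting can vary noticeably across runs.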
I know this. But MTB achieves 0.585 F1 with 1% training data in the C+M setting, and I use exactly this setting, yet the results seem wrong.
I can reproduce the BERT version, but MTB seems wrong.
Is the mode C+M? Also, you can try setting max_epoch to 50, because 20 epochs may not be enough to converge.
OK. I will try again. Thank you very much for your help!
One more thing: if the final result differs a little from the paper, that's normal, because the 1% set is very sensitive. But the relative improvement is consistent.
OK. Thank you very much!
OK. The results are normal now. This method even achieves better results than those reported in the paper. Thank you very much.