JayYip/m3tl

Question about reproducing results

18140663659 opened this issue · 8 comments

I just started using this library, and training a model with the code below seems to have a problem.

import m3tl
from m3tl.preproc_decorator import preprocessing_fn
from m3tl.params import Params
from m3tl.special_tokens import TRAIN
from m3tl.predefined_problems.ner_data import get_weibo_ner_fn

params = m3tl.params.Params()
for problem_type in params.list_available_problem_types():
    print('{problem_type}: {desc}'.format(
        desc=params.problem_type_desc[problem_type], problem_type=problem_type))

problem_type_dict = {'weibo_ner': 'seq_tag'}

processing_fn_dict = {'weibo_ner': get_weibo_ner_fn("data/ner/weiboNER*")}

from m3tl.run_bert_multitask import train_bert_multitask, eval_bert_multitask, predict_bert_multitask
problem = 'weibo_ner'

model = train_bert_multitask(
    params=params,
    problem=problem,
    num_epochs=20,
    problem_type_dict=problem_type_dict,
    processing_fn_dict=processing_fn_dict,
    continue_training=False
)
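For context, processing_fn_dict maps each problem name to a function that yields the raw inputs and targets, and get_weibo_ner_fn is the predefined one for this dataset. A minimal sketch of a custom function using the preprocessing_fn decorator imported above could look like the following; the file path, the token/tag file format, and the (params, mode) -> (inputs, targets) contract are assumptions on my part, not verified against the library:

from m3tl.preproc_decorator import preprocessing_fn
from m3tl.special_tokens import TRAIN

# Hypothetical custom seq_tag problem; the paths and "token/tag" line format are placeholders.
@preprocessing_fn
def my_seq_tag(params, mode):
    # Assumed contract: return parallel lists of token sequences and tag sequences.
    path = 'data/my_seq_tag/train.txt' if mode == TRAIN else 'data/my_seq_tag/dev.txt'
    inputs, targets = [], []
    with open(path, encoding='utf8') as f:
        for line in f:
            pairs = [p.rsplit('/', 1) for p in line.split() if '/' in p]
            if not pairs:
                continue
            tokens, tags = zip(*pairs)
            inputs.append(list(tokens))
            targets.append(list(tags))
    return inputs, targets

# It would then be registered the same way as weibo_ner:
# problem_type_dict = {'weibo_ner': 'seq_tag', 'my_seq_tag': 'seq_tag'}
# processing_fn_dict = {'weibo_ner': get_weibo_ner_fn("data/ner/weiboNER*"),
#                       'my_seq_tag': my_seq_tag}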

The specific output is as follows:
/0: 2.7941
Epoch 3/20
11/11 [==============================] - 41s 4s/step - mean_acc: 1.4229 - weibo_ner_acc: 0.0699 - BertMultiTaskTop/weibo_ner/losses/0: 2.7844
Epoch 4/20
11/11 [==============================] - 45s 4s/step - mean_acc: 1.4211 - weibo_ner_acc: 0.0680 - BertMultiTaskTop/weibo_ner/losses/0: 2.7807
Epoch 5/20
11/11 [==============================] - 45s 4s/step - mean_acc: 1.4175 - weibo_ner_acc: 0.0700 - BertMultiTaskTop/weibo_ner/losses/0: 2.7702
Epoch 6/20

I don't see anything wrong. Did you paste the full log?

The training acc stays very low the whole time.

Understood, I'll take a look when I have time.

I have the same issue with low accuracy. It seems chaining the problems with "|" gives better results than "&".

problem = 'c2_cls|c3_cls|c4_cls|c5_cls|c6_cls'
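As a reference point, both chaining styles are just different problem strings passed to train_bert_multitask. The sketch below assumes the same setup (params, problem_type_dict, processing_fn_dict) as in the first comment, with the c*_cls problems registered as classification tasks; the comment on what "&" means is my own reading and is not confirmed in this thread.

from m3tl.run_bert_multitask import train_bert_multitask

# Assumes params, problem_type_dict and processing_fn_dict are built as in the
# first comment, with an entry for each c*_cls problem.
problem_pipe = 'c2_cls|c3_cls|c4_cls|c5_cls|c6_cls'  # "|": separate problems trained jointly
problem_amp = 'c2_cls&c3_cls&c4_cls&c5_cls&c6_cls'   # "&": problems sharing the same inputs (my reading)

model = train_bert_multitask(
    params=params,
    problem=problem_pipe,  # swap in problem_amp to compare the two chaining styles
    num_epochs=20,
    problem_type_dict=problem_type_dict,
    processing_fn_dict=processing_fn_dict,
    continue_training=False
)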

Thanks for the info. Will take a look tomorrow. BTW, are the results significantly better (meaning they can reach the expected accuracy) when chaining with "|", or just slightly better?

It is only slightly better. BTW, thanks for your hard work, but it seems there is a problem with the accuracy. When using your code for my case (offensive language detection), the accuracy gets stuck at about 63%, while I can obtain 83% with mt-dnn.

In that case, I think there's a bug. I'll investigate further.

I can't seem to reproduce this. It runs normally on my side. The mean_acc computation has a minor issue, but it doesn't affect the results.

Epoch 2/20
48/48 [==============================] - 5s 95ms/step - mean_acc: 0.5634 - weibo_ner_acc: 0.9591 - BertMultiTaskTop/weibo_ner/losses/0: 0.1731
Epoch 3/20
48/48 [==============================] - 4s 74ms/step - mean_acc: 0.5293 - weibo_ner_acc: 0.9703 - BertMultiTaskTop/weibo_ner/losses/0: 0.0904
Epoch 4/20
48/48 [==============================] - 4s 74ms/step - mean_acc: 0.5181 - weibo_ner_acc: 0.9772 - BertMultiTaskTop/weibo_ner/losses/0: 0.0574
Epoch 5/20
48/48 [==============================] - 4s 74ms/step - mean_acc: 0.5119 - weibo_ner_acc: 0.9814 - BertMultiTaskTop/weibo_ner/losses/0: 0.0445
Epoch 6/20
48/48 [==============================] - 4s 73ms/step - mean_acc: 0.5135 - weibo_ner_acc: 0.9845 - BertMultiTaskTop/weibo_ner/losses/0: 0.0412
Epoch 7/20
48/48 [==============================] - 5s 95ms/step - mean_acc: 0.5121 - weibo_ner_acc: 0.9863 - BertMultiTaskTop/weibo_ner/losses/0: 0.0391

Not sure whether this is caused by the tf and transformers versions.

The versions I tested with:

tf.__version__: 2.6.2
transformers.__version__: 4.19.2
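For anyone trying to reproduce, a quick snippet to print the same version info from your own environment for comparison:

import tensorflow as tf
import transformers

# Print the library versions to compare against the ones reported above.
print('tf.__version__:', tf.__version__)
print('transformers.__version__:', transformers.__version__)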