kamalkraj/BERT-NER

F1 score of self-trained model based on bert-base-uncased seems unreasonably low

yangheng95 opened this issue · 2 comments

Hello, thank you for your great work!

With the experiment branch, the F1 score can reach the high level mentioned in this repo. However, when I trained a model based on bert-base-uncased, it only reached an F1 score of approximately 0.81, which seems unreasonably low. How can I reach the promising F1 score when training the model myself?

Hope for your reply, kind regards.

@yangheng95
use the dev branch

Hello, thank you for your reply. I used the dev branch to run another training. The results are as follows, but they are still not good enough.

```
           precision    recall  f1-score   support

     MISC     0.6113    0.6909    0.6487       922
      PER     0.8609    0.8605    0.8607      1842
      ORG     0.7292    0.7651    0.7467      1341
      LOC     0.8175    0.8073    0.8124      1837

micro avg     0.7751    0.7962    0.7855      5942
macro avg     0.7791    0.7962    0.7871      5942
```
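
For reference, this per-entity report matches the output format of seqeval's `classification_report`. A minimal sketch of how such a report is produced, with toy IOB2 labels of my own:

```python
# Minimal sketch: producing a per-entity report like the one above with
# seqeval. The IOB2-tagged sentences here are toy data, not CoNLL-2003.
from seqeval.metrics import classification_report

y_true = [["B-PER", "I-PER", "O", "B-LOC", "O"],
          ["B-ORG", "O", "B-MISC", "O"]]
y_pred = [["B-PER", "I-PER", "O", "B-LOC", "O"],
          ["B-ORG", "O", "B-LOC", "O"]]

# digits=4 reproduces the four-decimal formatting seen above.
print(classification_report(y_true, y_pred, digits=4))
```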

These are my parameter settings. Is there any error in my configuration?
I hope for your assistance.
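
For reference, the argument dump below corresponds roughly to the following invocation (a sketch reconstructed from the logged values; whether each flag exists with exactly this spelling in run_ner.py is my assumption):

```python
# Sketch: reconstructing the training command from the logged arguments
# below. Flag names mirror the log entries; paths are taken from the log.
import subprocess

subprocess.run([
    "python", "run_ner.py",
    "--data_dir", "data",
    "--bert_model", "bert-base-uncased",
    "--task_name", "ner",
    "--output_dir", "output",
    "--max_seq_length", "128",
    "--do_train",                 # boolean flags show up as True in the log
    "--do_eval",
    "--eval_on", "dev",
    "--train_batch_size", "32",
    "--eval_batch_size", "8",
    "--learning_rate", "5e-05",
    "--num_train_epochs", "3.0",
    "--warmup_proportion", "0.1",
    # --do_lower_case is NOT passed, so it stays False (see the note
    # after the log below).
], check=True)
```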

```
01/06/2020 17:59:41 - INFO - __main__ -   >>> data_dir: data
01/06/2020 17:59:41 - INFO - __main__ -   >>> bert_model: bert-base-uncased
01/06/2020 17:59:41 - INFO - __main__ -   >>> task_name: ner
01/06/2020 17:59:41 - INFO - __main__ -   >>> output_dir: output
01/06/2020 17:59:41 - INFO - __main__ -   >>> cache_dir: 
01/06/2020 17:59:41 - INFO - __main__ -   >>> max_seq_length: 128
01/06/2020 17:59:41 - INFO - __main__ -   >>> do_train: True
01/06/2020 17:59:41 - INFO - __main__ -   >>> do_eval: True
01/06/2020 17:59:41 - INFO - __main__ -   >>> eval_on: dev
01/06/2020 17:59:41 - INFO - __main__ -   >>> do_lower_case: False
01/06/2020 17:59:41 - INFO - __main__ -   >>> train_batch_size: 32
01/06/2020 17:59:41 - INFO - __main__ -   >>> eval_batch_size: 8
01/06/2020 17:59:41 - INFO - __main__ -   >>> learning_rate: 5e-05
01/06/2020 17:59:41 - INFO - __main__ -   >>> num_train_epochs: 3.0
01/06/2020 17:59:41 - INFO - __main__ -   >>> warmup_proportion: 0.1
01/06/2020 17:59:41 - INFO - __main__ -   >>> weight_decay: 0.01
01/06/2020 17:59:41 - INFO - __main__ -   >>> adam_epsilon: 1e-08
01/06/2020 17:59:41 - INFO - __main__ -   >>> max_grad_norm: 1.0
01/06/2020 17:59:41 - INFO - __main__ -   >>> no_cuda: False
01/06/2020 17:59:41 - INFO - __main__ -   >>> local_rank: -1
01/06/2020 17:59:41 - INFO - __main__ -   >>> seed: 42
01/06/2020 17:59:41 - INFO - __main__ -   >>> gradient_accumulation_steps: 1
01/06/2020 17:59:41 - INFO - __main__ -   >>> fp16: False
01/06/2020 17:59:41 - INFO - __main__ -   >>> fp16_opt_level: O1
01/06/2020 17:59:41 - INFO - __main__ -   >>> loss_scale: 0
01/06/2020 17:59:41 - INFO - __main__ -   >>> server_ip: 
01/06/2020 17:59:41 - INFO - __main__ -   >>> server_port: 
01/06/2020 17:59:41 - INFO - __main__ -   device: cuda n_gpu: 2, distributed training: False, 16-bits training: False
```
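
One setting in this dump is worth double-checking: bert_model is bert-base-uncased while do_lower_case is False. An uncased checkpoint ships a lowercase-only WordPiece vocabulary, so feeding it cased text can fragment tokens or map them to [UNK], which tends to hurt NER scores. A minimal sketch of the effect, assuming the Hugging Face transformers BertTokenizer (the tokenizer wiring inside this repo may differ):

```python
# Sketch: how do_lower_case interacts with an uncased checkpoint.
# Assumes the Hugging Face `transformers` BertTokenizer; the repo's own
# tokenizer setup may differ, but the vocabulary behaviour is the same.
from transformers import BertTokenizer

for lower in (False, True):
    tok = BertTokenizer.from_pretrained("bert-base-uncased", do_lower_case=lower)
    # With do_lower_case=False, cased surface forms such as "Washington"
    # are absent from the lowercase-only vocab, so they fragment or
    # fall back to [UNK]; with do_lower_case=True they match cleanly.
    print(lower, tok.tokenize("Johnson visited Washington"))
```

Passing --do_lower_case when training with an uncased model (or switching to bert-base-cased, which is generally preferred for CoNLL-2003 NER since capitalization is a strong entity cue) may be worth trying.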