yxuansu/TaCL

Cannot reproduce the results

leoozy opened this issue · 3 comments

Hello, I tried to pre-train the model with your provided scripts, but the results are much lower than the reported ones, even lower than the BERT baseline. I pre-trained the model with the following steps:

  1. Firstly, I preprocessed the data and obtained english_wiki_20k_lines.txt.
  2. Then I pre-trained the model with the following script:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python train.py \
    --language english \
    --model_name bert-base-uncased \
    --train_data ../pretraining_data/english/english_wiki_20k_lines.txt \
    --number_of_gpu 8 \
    --max_len 256 \
    --batch_size_per_gpu 32 \
    --gradient_accumulation_steps 1 \
    --effective_batch_size 256 \
    --learning_rate 1e-4 \
    --total_steps 150010 \
    --print_every 500 \
    --save_every 10000 \
    --ckpt_save_path ./ckpt/

My environment is: CUDA 11.2, 8 × V100, transformers 1.7.0, PyTorch 1.10.
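
As a sanity check on the batch-size flags (just arithmetic, spelled out here for clarity):

# effective batch size = batch_size_per_gpu * number_of_gpu * gradient_accumulation_steps
python -c "print(32 * 8 * 1)"   # prints 256, which matches --effective_batch_size 256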

Then I fine-tuned the model on the English SQuAD dataset with the following script:
# cambridgeltl/tacl-bert-base-uncased
CUDA_VISIBLE_DEVICES=$1 python run_qa.py \
    --model_name_or_path $2 \
    --dataset_name squad_v2 \
    --do_train \
    --do_eval \
    --version_2_with_negative \
    --per_device_train_batch_size 12 \
    --learning_rate 3e-5 \
    --num_train_epochs 2 \
    --max_seq_length 384 \
    --doc_stride 128 \
    --output_dir $3
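
For clarity, the positional arguments are filled in roughly like this (the paths below are placeholders, not my exact ones):

# $1 -> 0                               (CUDA device id)
# $2 -> ./ckpt/<last_saved_checkpoint>  (the last checkpoint saved during pre-training)
# $3 -> ./squad2_output                 (output directory)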
My environment is transformers 1.15.0 and a single V100.
The model used is the last checkpoint saved during pre-training. The results are:
[image: SQuAD 2.0 evaluation results]

The results are much lower than the reported ones. Could you give me some advice? Is this the same pre-training script you used? Thank you!

Hi,

(1) Did you first verify the results with our provided checkpoint? (Example commands for all three points are sketched after this list.)
(2) Also, could you train the model using exactly the same environment as described in requirements.txt? We found that the torch version (we use 1.6) can make a big difference in model performance.
(3) Thirdly, please do not use the english_wiki_20k_lines.txt file; it is only an example used for debugging the code. In our experiments we use the first 20 million lines of the raw wiki data (i.e. the first 20 million lines of ./english_wiki.txt, which should be around 2.5 GB).
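
A rough sketch of the three checks above, in case it is useful (the output directory and the sliced file name are placeholders, not the exact names we used):

# (1) evaluate the released checkpoint with the same fine-tuning flags, only swapping the model path
CUDA_VISIBLE_DEVICES=0 python run_qa.py \
    --model_name_or_path cambridgeltl/tacl-bert-base-uncased \
    --dataset_name squad_v2 --do_train --do_eval --version_2_with_negative \
    --per_device_train_batch_size 12 --learning_rate 3e-5 --num_train_epochs 2 \
    --max_seq_length 384 --doc_stride 128 --output_dir ./tacl_squad2_check

# (2) recreate the pinned environment and double-check the torch version
pip install -r requirements.txt
python -c "import torch; print(torch.__version__)"   # should print 1.6.x

# (3) build the real pre-training file from the first 20 million lines of the raw dump
head -n 20000000 ./english_wiki.txt > ./english_wiki_20m_lines.txt
ls -lh ./english_wiki_20m_lines.txt   # should be around 2.5 GB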

For your reference, the SQuAD results from our released checkpoint are shown below.

SQuAD 1.1:
[image: SQuAD 1.1 results from the released checkpoint]

SQuAD 2.0:
[image: SQuAD 2.0 results from the released checkpoint]

Hope the responses can help you.

Feel free to reopen this issue.