bug in run_flue.py
Closed this issue · 7 comments
Hi, I got this error when running run_flue.py:
from transformers import flue_compute_metrics as compute_metrics
ImportError: cannot import name 'flue_compute_metrics'
I have already installed the requirements and updated the transformers directory.
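As a quick sanity check (a sketch, not part of the repo), one can verify which `transformers` installation Python actually picks up and whether it exposes `flue_compute_metrics`; the assumption here is that the stock library may be shadowing the formiel fork that provides this function:

```python
# Sketch: check which `transformers` installation Python imports and
# whether it exposes `flue_compute_metrics`. If the stock library is
# found instead of the fork, the import in run_flue.py will fail.
import importlib.util

spec = importlib.util.find_spec("transformers")
if spec is None:
    print("transformers is not installed in this environment")
else:
    print("transformers found at:", spec.origin)
    mod = importlib.import_module("transformers")
    print("has flue_compute_metrics:", hasattr(mod, "flue_compute_metrics"))
```

If the printed path points at a stock pip install rather than the fork, a forced reinstall from the fork (as suggested below in this thread) should fix the import.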
Hi @keloemma ,
Please try reinstalling using the following command:
pip install --upgrade --force-reinstall git+https://github.com/formiel/transformers.git@flue
Thanks, I get this error:
Flaubert$ bash finetuning_flue.sh
usage: run_flue.py [-h] --data_dir DATA_DIR --model_type MODEL_TYPE
                   --model_name_or_path MODEL_NAME_OR_PATH --task_name TASK_NAME
                   --output_dir OUTPUT_DIR [--config_name CONFIG_NAME]
                   [--tokenizer_name TOKENIZER_NAME] [--cache_dir CACHE_DIR]
                   [--max_seq_length MAX_SEQ_LENGTH] [--do_train] [--do_eval]
                   [--do_lower_case]
                   [--per_gpu_train_batch_size PER_GPU_TRAIN_BATCH_SIZE]
                   [--per_gpu_eval_batch_size PER_GPU_EVAL_BATCH_SIZE]
                   [--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS]
                   [--learning_rate LEARNING_RATE] [--weight_decay WEIGHT_DECAY]
                   [--adam_epsilon ADAM_EPSILON] [--max_grad_norm MAX_GRAD_NORM]
                   [--num_full_passes NUM_FULL_PASSES] [--max_steps MAX_STEPS]
                   [--warmup_steps WARMUP_STEPS] [--no_cuda]
                   [--overwrite_output_dir] [--overwrite_cache] [--seed SEED]
                   [--fp16] [--fp16_opt_level FP16_OPT_LEVEL]
                   [--local_rank LOCAL_RANK] [--server_ip SERVER_IP]
                   [--server_port SERVER_PORT]
                   [--val_metrics {acc,f1,acc_and_f1}]
                   [--early_stopping_patience EARLY_STOPPING_PATIENCE]
                   [--steps_per_epoch STEPS_PER_EPOCH] [--do_test]
                   [--scheduler {constant,constant-warmup,linear-warmup,cosine-warmup,None}]
run_flue.py: error: unrecognized arguments: --num_train_epochs 30 --save_steps 50000
It seems those two arguments are not recognized, so I commented out num_train_epochs in run_flue.py, but there was no --save_steps to remove.
Should I drop it from the command as well? (It is listed in the parameter list of the FLUE evaluation example.)
Hi @keloemma ,
There are no num_train_epochs or save_steps parameters in the run_flue.py script. The num_train_epochs parameter was replaced by epochs, and the steps_per_epoch parameter was added to control the number of steps per epoch in case you do not want to pass through the whole dataset in one epoch. This change also replaces the save_steps parameter.
Please refer to the script for more details and a description of each parameter. You should modify the running command that you used previously for run_glue.py accordingly.
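A minimal sketch of an adjusted command under these assumptions (all variable values are placeholders; whether the epoch flag is spelled --epochs or --num_full_passes in your checkout is an assumption, so check the argparse definitions in run_flue.py before running):

```shell
# Hypothetical adjusted command: the epoch flag and --steps_per_epoch
# replace the old run_glue.py flags --num_train_epochs and --save_steps.
python run_flue.py \
    --data_dir "$data_dir" \
    --model_type flaubert \
    --model_name_or_path "$model_name_or_path" \
    --task_name "$task_name" \
    --output_dir "$output_dir" \
    --max_seq_length 512 \
    --do_train \
    --do_eval \
    --steps_per_epoch 5000 \
    --learning_rate "$lr"
```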
Hello @formiel
I got another problem when running the script:
04/06/2020 10:44:08 - WARNING - __main__ - Process rank: -1, device: cuda, n_gpu: 2, distributed training: False, 16-bits training: True
Traceback (most recent call last):
  File "/home/transformers/examples/run_flue.py", line 795, in <module>
    main()
  File "/home/transformers/examples/run_flue.py", line 750, in main
    cache_dir=args.cache_dir if args.cache_dir else None,
  File "/home/anaconda3/envs/env/lib/python3.6/site-packages/transformers/configuration_utils.py", line 188, in from_pretrained
    config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/anaconda3/envs/env/lib/python3.6/site-packages/transformers/configuration_utils.py", line 240, in get_config_dict
    config_dict = cls._dict_from_json_file(resolved_config_file)
  File "/home/anaconda3/envs/env/lib/python3.6/site-packages/transformers/configuration_utils.py", line 329, in _dict_from_json_file
    text = reader.read()
  File "/home/anaconda3/envs/env/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
This is the command I run when launching:
python ~/eXP/Flaubert/transformers/examples/run_flue.py \
    --data_dir $data_dir \
    --model_type flaubert \
    --model_name_or_path $model_name_or_path \  # best-*.pth
    --task_name $task_name \
    --output_dir $output_dir \
    --max_seq_length 512 \
    --do_train \
    --do_eval \
    --max_steps $epochs \  # In the script, it says this parameter overwrites num_train_epochs
    --learning_rate $lr \
    --fp16 \
    --fp16_opt_level O1 \
    |& tee output.log
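One possible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: byte 0x80 at position 0 is typical of a pickled binary file, so from_pretrained may be trying to parse the best-*.pth checkpoint itself as a JSON config. A small sketch to check what kind of file a path points to (looks_like_json is a hypothetical helper, not part of transformers):

```python
# Sketch: distinguish a JSON text file from a binary checkpoint by its
# first byte. A JSON config starts with '{' or '['; a pickled .pth file
# typically starts with a non-text byte such as 0x80, which is exactly
# the byte the UnicodeDecodeError complains about.
from pathlib import Path

def looks_like_json(path):
    """Return True if the file begins like a JSON document rather than binary data."""
    first = Path(path).read_bytes()[:1]
    return first in (b"{", b"[")
```

If this returns False for the path given to --model_name_or_path, passing a directory containing a config.json (or using the --config_name flag listed in the usage above) may be what the script expects; check its documentation.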
When running on another server, I got this error in the output log file:
  File "/home/transformers/run_flue.py", line 357
    print(json.dumps({**logs, **{"step": global_step}}))
                       ^
SyntaxError: invalid syntax
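For what it's worth, a plausible explanation (an assumption based on the failing line, not a confirmed diagnosis): `{**a, **b}` is PEP 448 dict unpacking, which is only valid syntax on Python 3.5 and later, so an older interpreter (e.g. a system Python 2) would raise exactly this SyntaxError. A minimal sketch with a backwards-compatible equivalent:

```python
# The failing line merges two dicts with PEP 448 unpacking (Python >= 3.5).
logs = {"loss": 0.5}
global_step = 100

merged = {**logs, **{"step": global_step}}  # SyntaxError on Python < 3.5

# Equivalent that also parses on older interpreters:
merged_compat = dict(logs)
merged_compat.update({"step": global_step})

assert merged == merged_compat
```

So it may be worth checking that the `python` on that server is the Python 3.6 conda environment and not an older default interpreter.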
Do you perhaps know how I can solve it?
I reloaded and retried, but still got the same error:
04/23/2020 10:19:32 - WARNING - __main__ - Process rank: -1, device: cuda, n_g, 16-bits training: True
Traceback (most recent call last):
  File "transformers/examples/run_flue.py", line 782, in <module>
    main()
  File "transformers/examples/run_flue.py", line 737, in main
    cache_dir=args.cache_dir if args.cache_dir else None,
  File "/home/getalp/kelodjoe/anaconda3/envs/env/lib/python3.6/site-packages/tra line 188, in from_pretrained
    config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **k
  File "/home/getalp/kelodjoe/anaconda3/envs/env/lib/python3.6/site-packages/tra line 240, in get_config_dict
    config_dict = cls._dict_from_json_file(resolved_config_file)
  File "/home/getalp/kelodjoe/anaconda3/envs/env/lib/python3.6/site-packages/tra line 329, in _dict_from_json_file
    text = reader.read()
  File "/home/getalp/kelodjoe/anaconda3/envs/env/lib/python3.6/codecs.py", line
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid