salesforce/simpletod

Issues on end2end evaluation


When I train the end2end model with the default parameters, the results are very different from those reported in the paper; there is also a gap of about 5% in the DST evaluation. Is there anything that needs to be modified? Thanks.

Hello, I am also trying to reproduce the results. I noticed that many checkpoints are saved; which checkpoint do you use, and how do you figure out which one is best?

Hi, I have a problem:

main.py: error: argument --per_gpu_train_batch_size: expected one argument

How do I solve this? Any help would be appreciated.

> When I train the end2end model with the default parameters, the results are very different from those reported in the paper; there is also a gap of about 5% in the DST evaluation.

Try ignoring the "not mentioned", "none", and "dontcare" values in both the generated and the target belief states when computing joint accuracy.
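A minimal sketch of that filtering (hypothetical helper names, not the repo's actual evaluation code; the (domain, slot, value) set representation is an assumption):

# Hypothetical sketch of joint accuracy that ignores uninformative values.
# Not simpletod's evaluator; the (domain, slot, value) format is assumed.
IGNORED_VALUES = {"not mentioned", "none", "dontcare"}

def filter_state(state):
    # Keep only triples whose value is informative.
    return {(d, s, v) for (d, s, v) in state if v not in IGNORED_VALUES}

def joint_accuracy(predicted_states, target_states):
    # Fraction of turns whose filtered prediction exactly matches the filtered target.
    correct = sum(filter_state(p) == filter_state(t)
                  for p, t in zip(predicted_states, target_states))
    return correct / len(target_states)

# Example: the mismatched "dontcare"/"none" slot is ignored, so the turn counts as correct.
pred = [{("hotel", "area", "centre"), ("hotel", "parking", "dontcare")}]
gold = [{("hotel", "area", "centre"), ("hotel", "parking", "none")}]
print(joint_accuracy(pred, gold))  # 1.0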

> main.py: error: argument --per_gpu_train_batch_size: expected one argument

I guess you need to pass a value to that argument; the error means the flag was given without one.
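For example (the flags and the batch size 4 below are taken from the script later in this thread, not a verified command):

python main.py --output_dir=output\gpt2 --model_type=gpt2 --model_name_or_path=gpt2 --do_train --train_data_file=.\resources\gpt2\train.history_belief_action_sys_delex --per_gpu_train_batch_size 4

rather than leaving --per_gpu_train_batch_size bare at the end of the command.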

I tried to run the end2end training with train_end2end.bat %%GPU gpt2 gpt2,

but I got this error:


Traceback (most recent call last):
  File "D:\simpletod-master\simpletod-master\models\configuration_utils.py", line 257, in get_config_dict
    raise EnvironmentError
OSError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\simpletod-master\simpletod-master\main.py", line 419, in <module>
    main()
  File "D:\simpletod-master\simpletod-master\main.py", line 409, in main
    model = model_class.from_pretrained(checkpoint)
  File "D:\simpletod-master\simpletod-master\models\modeling_utils.py", line 383, in from_pretrained
    config, model_kwargs = cls.config_class.from_pretrained(
  File "D:\simpletod-master\simpletod-master\models\configuration_utils.py", line 189, in from_pretrained
    config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "D:\simpletod-master\simpletod-master\models\configuration_utils.py", line 273, in get_config_dict
    raise EnvironmentError(msg)
OSError: Model name '%OUTPUT' was not found in model name list. We assumed 'https://s3.amazonaws.com/models.huggingface.co/bert/%OUTPUT/config.json' was a path, a model identifier, or url to a configuration file named config.json or a directory containing such a file but couldn't find any such file at this path or url.

I downloaded config.json, pytorch_model.bin, and tf_model.h5 for the gpt2 model from Hugging Face and put them in the project folder, with another copy inside the output folder.
How can I solve this issue, please?

FYI, the OS is Windows, and train_end2end.bat is:

set MODEL=%%2
set MODEL_NAME=%%3
set BATCH=%%4
set OUTPUT=output\%%{MODEL_NAME}

set TRAIN_FILE=".\resources\gpt2\train.history_belief_action_sys_delex"
set TEST_FILE=".\resources\gpt2\val.history_belief_action_sys_delex"


set CUDA_VISIBLE_DEVICES=%%1
python main.py --output_dir=%%OUTPUT    --model_type=%%MODEL  --model_name_or_path=%%MODEL_NAME    --do_train    --train_data_file=".\resources\gpt2\train.history_belief_action_sys_delex"  --do_eval  --eval_data_file=".\resources\gpt2\val.history_belief_action_sys_delex"  --evaluate_during_training  --save_steps 10000   --logging_steps 1000   --per_gpu_train_batch_size 4    --num_train_epochs 100

I also set tokenizer.max_len = 1024 in main.py, because the script raised an error saying max_len was not defined.
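For what it's worth, the literal '%OUTPUT' in the traceback suggests the batch variables are never expanded: in a cmd.exe batch file, positional arguments are written %1..%4 (not %%1), and variables are read back as %NAME% (the %%{NAME} form looks like a bash habit). A corrected sketch of the script under that assumption:

rem Corrected train_end2end.bat (sketch): %1 = GPU id, %2 = model type,
rem %3 = model name, %4 = batch size.
set MODEL=%2
set MODEL_NAME=%3
set BATCH=%4
set OUTPUT=output\%MODEL_NAME%

set TRAIN_FILE=.\resources\gpt2\train.history_belief_action_sys_delex
set TEST_FILE=.\resources\gpt2\val.history_belief_action_sys_delex

set CUDA_VISIBLE_DEVICES=%1
python main.py --output_dir=%OUTPUT% --model_type=%MODEL% --model_name_or_path=%MODEL_NAME% --do_train --train_data_file=%TRAIN_FILE% --do_eval --eval_data_file=%TEST_FILE% --evaluate_during_training --save_steps 10000 --logging_steps 1000 --per_gpu_train_batch_size %BATCH% --num_train_epochs 100

invoked as, e.g., train_end2end.bat 0 gpt2 gpt2 4.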

> I tried to run the end2end training with train_end2end.bat %%GPU gpt2 gpt2, but I got this error

I'd recommend directly using Hugging Face's language-modeling training scripts.

Which training script? Sorry, but I am new to this field.

> Which training script? Sorry, but I am new to this field.

https://github.com/huggingface/transformers/tree/master/examples/pytorch/language-modeling

run_clm.py and run_clm_no_trainer.py can do the job.
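For reference, a typical invocation of run_clm.py looks like the following (the file names are placeholders; note that each flag takes a single =):

python run_clm.py --model_name_or_path gpt2 --train_file train.txt --validation_file val.txt --do_train --do_eval --output_dir output\clm --num_train_epochs 2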

> https://github.com/huggingface/transformers/tree/master/examples/pytorch/language-modeling
>
> run_clm.py and run_clm_no_trainer.py can do the job.

I did the following:

1. Copied run_clm.py and run_clm_no_trainer.py to the project dir.
2. Ran:

python run_clm.py --model_name_or_path gpt2 --train_file==".\resources\gpt2\train.history_belief_action_sys_delex.json" --validation_file=".\resources\gpt2\val.history_belief_action_sys_delex.json" --do_train --do_eval --output_dir /tmp/test-clm --num_train_epochs 2

and got this error:


Traceback (most recent call last):
  File "D:\simpletod-master\simpletod-master\run_clm.py", line 51, in <module>
    check_min_version("4.7.0.dev0")
  File "D:\simpletod-master\venv\lib\site-packages\transformers\utils\__init__.py", line 32, in check_min_version
    raise ImportError(
ImportError: This example requires a source install from 🤗 Transformers (see `https://huggingface.co/transformers/installation.html#installing-from-source`), but the version found is 4.4.2.
Check out https://huggingface.co/transformers/examples.html for the examples corresponding to other versions of 🤗 Transformers.
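The usual fix for this mismatch (my reading of the error text, not something confirmed in this thread) is either to install Transformers from source,

pip install git+https://github.com/huggingface/transformers

or to use the copy of run_clm.py from the release tag matching the installed version (4.4.2 here), i.e. the language-modeling example under https://github.com/huggingface/transformers/tree/v4.4.2, instead of the one from master.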

> run_clm.py and run_clm_no_trainer.py can do the job.

And when I run:

python run_clm_no_trainer.py --model_name_or_path gpt2 --train_file==".\resources\gpt2\train.history_belief_action_sys_delex" --validation_file=".\resources\gpt2\val.history_belief_action_sys_delex" --output_dir /tmp/test-clm --num_train_epochs 2

It gives me the error

Traceback (most recent call last):
  File "D:\simpletod-master\simpletod-master\un_clm_no_trainer.py", line 458, in <module>
    main()
  File "D:\simpletod-master\simpletod-master\run_clm_no_trainer.py", line 192, in main
    args = parse_args()
  File "D:\simpletod-master\simpletod-master\un_clm_no_trainer.py", line 180, in parse_args
    assert extension in ["csv", "json", "txt"], "`train_file` should be a csv, json or txt file."
AssertionError: `train_file` should be a csv, json or txt file.

but both files that it needs for training do exist:
.\resources\gpt2\train.history_belief_action_sys_delex
.\resources\gpt2\val.history_belief_action_sys_delex

> AssertionError: `train_file` should be a csv, json or txt file.

Change the file names so they have a .txt extension.
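For example, on Windows (the A.txt/B.txt names match what this user tries next; copy is the stock cmd.exe command):

copy .\resources\gpt2\train.history_belief_action_sys_delex .\resources\gpt2\A.txt
copy .\resources\gpt2\val.history_belief_action_sys_delex .\resources\gpt2\B.txt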

> python run_clm_no_trainer.py --model_name_or_path gpt2 --train_file==".\resources\gpt2\train.history_belief_action_sys_delex" --validation_file=".\resources\gpt2\val.history_belief_action_sys_delex" --output_dir /tmp/test-clm --num_train_epochs 2

>>python run_clm_no_trainer.py --model_name_or_path gpt2 --train_file==.\resources\gpt2\A.txt --validation_file=.\resources\gpt2\B.txt  --output_dir /tmp/test-clm --num_train_epochs 2

it gives a path-not-found error:

05/30/2021 11:30:41 - INFO - __main__ -   Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda
Use FP16 precision: False

Traceback (most recent call last):
  File "D:\simpletod-master\simpletod-master\un_clm_no_trainer.py", line 458, in <module>
    main()
  File "D:\simpletod-master\simpletod-master\run_clm_no_trainer.py", line 250, in main
    raw_datasets = load_dataset(extension, data_files=data_files)
  File "C:\Users\miniconda3\envs\torch\lib\site-packages\datasets\load.py", line 730, in load_dataset
    builder_instance: DatasetBuilder = builder_cls(
  File "C:\Users\miniconda3\envs\torch\lib\site-packages\datasets\builder.py", line 234, in __init__
    self.config, self.config_id = self._create_builder_config(
  File "C:\Users\miniconda3\envs\torch\lib\site-packages\datasets\builder.py", line 348, in _create_builder_confi
g
    config_id = builder_config.create_config_id(config_kwargs, custom_features=custom_features)
  File "C:\Users\miniconda3\envs\torch\lib\site-packages\datasets\builder.py", line 153, in create_config_id
    m.update(str(os.path.getmtime(data_file)))
  File "C:\Users\\miniconda3\envs\torch\lib\genericpath.py", line 55, in getmtime
    return os.stat(filename).st_mtime
FileNotFoundError: [WinError 3] The system cannot find the path specified: '=.\\resources\\gpt2\\A.txt'
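Note the leading '=' in the missing path ('=.\\resources\\gpt2\\A.txt'): with --train_file==.\resources\gpt2\A.txt, argparse splits on the first '=' only, so the value it receives is '=.\resources\gpt2\A.txt'. Dropping the doubled '=' should resolve it (my reading of the traceback, not a fix confirmed in this thread):

python run_clm_no_trainer.py --model_name_or_path gpt2 --train_file=.\resources\gpt2\A.txt --validation_file=.\resources\gpt2\B.txt --output_dir /tmp/test-clm --num_train_epochs 2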

@pleomax0730 Could you take a look at https://github.com//issues/27, please?