Please help me?
Deerzh opened this issue · 7 comments
Q1: Can you tell me how to set appropriate values for {YOUR_OTHER_ARGUMENTS} in this command: accelerate launch transformers_trainer_ddp.py --batch_size=30 {YOUR_OTHER_ARGUMENTS}?
Q2: When I run this command: python trainer.py --embedder_type=bert-large-cased,
an error occurred:
Traceback (most recent call last):
  File "trainer.py", line 12, in <module>
    from src.config import context_models, get_metric
ImportError: cannot import name 'context_models' from 'src.config' (/home/zhang/compatibility_analysis/pytorch_neural_crf/src/config/__init__.py)
Can you help me fix this issue?
{YOUR_OTHER_ARGUMENTS} can be left empty, or you can refer to all the arguments here: https://github.com/allanj/pytorch_neural_crf/blob/master/transformers_trainer.py#L29-L61

Please try to pull the latest version. It is fixed now.
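For example, a fuller command could look like this (argument names are taken from that file; the values are only illustrative, not recommended settings):

accelerate launch transformers_trainer_ddp.py --batch_size=30 --dataset=conll2003 --embedder_type=roberta-base --num_epochs=100 --learning_rate=2e-5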
I updated the code, but the errors still exist.
Error 1: when I run this command: python trainer.py --embedder_type=bert-large-cased,
I get an error like this:
usage: trainer.py [-h] [--device {cpu,cuda:0,cuda:1,cuda:2}] [--seed SEED]
[--dataset DATASET] [--embedding_file EMBEDDING_FILE]
[--embedding_dim EMBEDDING_DIM] [--optimizer OPTIMIZER]
[--learning_rate LEARNING_RATE] [--l2 L2]
[--lr_decay LR_DECAY] [--batch_size BATCH_SIZE]
[--num_epochs NUM_EPOCHS] [--train_num TRAIN_NUM]
[--dev_num DEV_NUM] [--test_num TEST_NUM]
[--max_no_incre MAX_NO_INCRE] [--model_folder MODEL_FOLDER]
[--hidden_dim HIDDEN_DIM] [--dropout DROPOUT]
[--use_char_rnn {0,1}] [--static_context_emb {none,elmo}]
[--add_iobes_constraint {0,1}]
trainer.py: error: unrecognized arguments: --embedder_type=bert-large-cased
Error 2: if I leave {YOUR_OTHER_ARGUMENTS} empty, an error still occurs:
Traceback (most recent call last):
  File "transformers_trainer_ddp.py", line 22, in <module>
    import datasets
ModuleNotFoundError: No module named 'datasets'
Traceback (most recent call last):
  File "/home/zhang/anaconda3/envs/neural/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/zhang/anaconda3/envs/neural/lib/python3.7/site-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/home/zhang/anaconda3/envs/neural/lib/python3.7/site-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/home/zhang/anaconda3/envs/neural/lib/python3.7/site-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/zhang/anaconda3/envs/neural/bin/python', 'transformers_trainer_ddp.py', '--batch_size=30']' returned non-zero exit status 1.
Following the README, you should run transformers_trainer.py rather than trainer.py. trainer.py is the trainer for the word-embedding LSTM-CRF model, which is why it does not recognize --embedder_type; see the example below.
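For example (a sketch; transformers_trainer.py takes the --embedder_type argument that trainer.py rejects):

python transformers_trainer.py --embedder_type=bert-large-cased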
For the second one, you need to:
pip install datasets
I just updated the README to include that. Thanks.
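If in doubt, you can confirm the package is visible inside the same conda environment (a quick generic check, nothing repo-specific):

python -c "import datasets; print(datasets.__version__)"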
Thank you for your reply, but I still have some questions about this.
Q1: Should I first run transformers_trainer.py and then run trainer.py, or should I just run transformers_trainer.py? I don't understand what you mean, because if I run trainer.py with the '--embedder_type=bert-large-cased' argument it raises an error, yet if I run trainer.py without arguments it runs successfully.
Q2: I have run pip install datasets, but when I run accelerate launch transformers_trainer_ddp.py --batch_size=30, the error still occurs, like this:
The following values were not passed to `accelerate launch` and had defaults used instead:
    `--num_processes` was set to a value of `1`
    `--num_machines` was set to a value of `1`
    `--mixed_precision` was set to a value of `'no'`
    `--num_cpu_threads_per_process` was set to `52` to improve out-of-box performance
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
09/02/2022 16:16:35 - INFO - __main__ - seed: 42
09/02/2022 16:16:35 - INFO - __main__ - dataset: conll2003
09/02/2022 16:16:35 - INFO - __main__ - optimizer: adamw
09/02/2022 16:16:35 - INFO - __main__ - learning_rate: 2e-05
09/02/2022 16:16:35 - INFO - __main__ - momentum: 0.0
09/02/2022 16:16:35 - INFO - __main__ - l2: 1e-08
09/02/2022 16:16:35 - INFO - __main__ - lr_decay: 0
09/02/2022 16:16:35 - INFO - __main__ - batch_size: 30
09/02/2022 16:16:35 - INFO - __main__ - num_epochs: 1
09/02/2022 16:16:35 - INFO - __main__ - train_num: -1
09/02/2022 16:16:35 - INFO - __main__ - dev_num: -1
09/02/2022 16:16:35 - INFO - __main__ - test_num: -1
09/02/2022 16:16:35 - INFO - __main__ - max_no_incre: 80
09/02/2022 16:16:35 - INFO - __main__ - max_grad_norm: 1.0
09/02/2022 16:16:35 - INFO - __main__ - fp16: 1
09/02/2022 16:16:35 - INFO - __main__ - model_folder: english_model
09/02/2022 16:16:35 - INFO - __main__ - hidden_dim: 0
09/02/2022 16:16:35 - INFO - __main__ - dropout: 0.5
09/02/2022 16:16:35 - INFO - __main__ - embedder_type: roberta-base
09/02/2022 16:16:35 - INFO - __main__ - add_iobes_constraint: 0
09/02/2022 16:16:35 - INFO - __main__ - print_detail_f1: 0
09/02/2022 16:16:35 - INFO - __main__ - earlystop_atr: micro
09/02/2022 16:16:35 - INFO - __main__ - mode: train
09/02/2022 16:16:35 - INFO - __main__ - test_file: data/conll2003/test.txt
Downloading builder script: 6.33kB [00:00, 2.49MB/s]
09/02/2022 16:16:45 - INFO - __main__ - [Data Info] Tokenizing the instances using 'roberta-base' tokenizer
09/02/2022 16:16:55 - INFO - __main__ - [Data Info] Reading dataset from:
data/conll2003/train.txt
data/conll2003/dev.txt
data/conll2003/test.txt
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Reading file: data/conll2003/train.txt, labels will be converted to IOBES encoding
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Modify src/data/transformers_dataset.read_txt function if you have other requirements
100%|███████████████████████████████████████████████████████| 300/300 [00:00<00:00, 855980.41it/s]
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - number of sentences: 14
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Using the training set to build label index
09/02/2022 16:16:55 - INFO - src.data.data_utils - #labels: 16
09/02/2022 16:16:55 - INFO - src.data.data_utils - label 2idx: {'<PAD>': 0, 'O': 1, 'S-ORG': 2, 'S-MISC': 3, 'B-PER': 4, 'E-PER': 5, 'S-LOC': 6, 'B-ORG': 7, 'E-ORG': 8, 'I-PER': 9, 'S-PER': 10, 'B-MISC': 11, 'I-MISC': 12, 'E-MISC': 13, '<START>': 14, '<STOP>': 15}
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] We are not limiting the max length in tokenizer. You should be aware of that
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Reading file: data/conll2003/dev.txt, labels will be converted to IOBES encoding
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Modify src/data/transformers_dataset.read_txt function if you have other requirements
100%|█████████████████████████████████████████████████████████| 50/50 [00:00<00:00, 213995.10it/s]
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - number of sentences: 2
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] We are not limiting the max length in tokenizer. You should be aware of that
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Reading file: data/conll2003/test.txt, labels will be converted to IOBES encoding
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] Modify src/data/transformers_dataset.read_txt function if you have other requirements
100%|███████████████████████████████████████████████████| 50350/50350 [00:00<00:00, 895523.33it/s]
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - number of sentences: 3684
09/02/2022 16:16:55 - INFO - src.data.transformers_dataset - [Data Info] We are not limiting the max length in tokenizer. You should be aware of that
Traceback (most recent call last):
  File "transformers_trainer_ddp.py", line 284, in <module>
    main()
  File "transformers_trainer_ddp.py", line 252, in main
    test_dataset = TransformersNERDataset(conf.test_file, tokenizer, number=conf.test_num, label2idx=train_dataset.label2idx, is_train=False)
  File "/home/zhang/compatibility_analysis/pytorch_neural_crf/src/data/transformers_dataset.py", line 94, in __init__
    self.insts_ids = convert_instances_to_feature_tensors(insts, tokenizer, label2idx)
  File "/home/zhang/compatibility_analysis/pytorch_neural_crf/src/data/transformers_dataset.py", line 53, in convert_instances_to_feature_tensors
    label_ids = [label2idx[label] for label in labels] if labels else [-100] * len(words)
  File "/home/zhang/compatibility_analysis/pytorch_neural_crf/src/data/transformers_dataset.py", line 53, in <listcomp>
    label_ids = [label2idx[label] for label in labels] if labels else [-100] * len(words)
KeyError: 'B-LOC'
Traceback (most recent call last):
  File "/home/zhang/anaconda3/envs/neural/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/zhang/anaconda3/envs/neural/lib/python3.7/site-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/home/zhang/anaconda3/envs/neural/lib/python3.7/site-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/home/zhang/anaconda3/envs/neural/lib/python3.7/site-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/zhang/anaconda3/envs/neural/bin/python', 'transformers_trainer_ddp.py', '--batch_size=30']' returned non-zero exit status 1.
You have a label 'B-LOC' in your test set that does not exist in your training set.
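If you want to see which labels are missing, something like this minimal sketch would do (it assumes the CoNLL-style layout of data/conll2003, where the label is the last whitespace-separated column of each non-empty line; note the trainer converts labels to IOBES, so for an exact check apply the same conversion first):

# compare the raw label sets of the train and test files
def read_labels(path):
    labels = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            # skip sentence boundaries and document markers
            if line and not line.startswith("-DOCSTART-"):
                labels.add(line.split()[-1])
    return labels

train_labels = read_labels("data/conll2003/train.txt")
test_labels = read_labels("data/conll2003/test.txt")
# any label printed here would trigger the KeyError above
print("in test but not in train:", test_labels - train_labels)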
Feel free to reopen the issue.