RuntimeError: The size of tensor a (400) must match the size of tensor b (0) at non-singleton dimension 1
Closed · 11 comments
When I use the following script to train the transformer model on CNN-DM (http://opennmt.net/OpenNMT-py/Summarization.html):
onmt_train -data data/cnndm/CNNDM \
    -save_model models/cnndm \
    -layers 4 \
    -rnn_size 512 \
    -word_vec_size 512 \
    -max_grad_norm 0 \
    -optim adam \
    -encoder_type transformer \
    -decoder_type transformer \
    -position_encoding \
    -dropout 0.2 \
    -param_init 0 \
    -warmup_steps 8000 \
    -learning_rate 2 \
    -decay_method noam \
    -label_smoothing 0.1 \
    -adam_beta2 0.998 \
    -batch_size 4096 \
    -batch_type tokens \
    -normalization tokens \
    -max_generator_batches 2 \
    -train_steps 200000 \
    -accum_count 4 \
    -share_embeddings \
    -copy_attn \
    -param_init_glorot \
    -world_size 2 \
    -gpu_ranks 0 1
I get the following error:
Traceback (most recent call last):
  File "train.py", line 438, in <module>
    main()
  File "train.py", line 430, in main
    train_model(model, fields, optim, data_type, model_opt)
  File "train.py", line 252, in train_model
    train_stats = trainer.train(train_iter, epoch, report_func)
  File "/home/cai/yym/ddl/final/OpenNMT-py-copy_constraint/onmt/Trainer.py", line 178, in train
    report_stats, normalization)
  File "/home/cai/yym/ddl/final/OpenNMT-py-copy_constraint/onmt/Trainer.py", line 311, in _gradient_accumulation
    trunc_size, self.shard_size, normalization)
  File "/home/cai/yym/ddl/final/OpenNMT-py-copy_constraint/onmt/Loss.py", line 123, in sharded_compute_loss
    loss, stats = self._compute_loss(batch, **shard)
  File "/home/cai/yym/ddl/final/OpenNMT-py-copy_constraint/onmt/modules/CopyGenerator.py", line 201, in _compute_loss
    batch.src_map)
  File "/home/cai/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/cai/yym/ddl/final/OpenNMT-py-copy_constraint/onmt/modules/CopyGenerator.py", line 99, in forward
    mul_attn = torch.mul(attn, tags) * 2
RuntimeError: The size of tensor a (400) must match the size of tensor b (0) at non-singleton dimension 1
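For reference, the mismatch is reproducible in plain PyTorch whenever one operand has size 0 on the offending dimension; a minimal sketch, independent of OpenNMT (the names attn and tags just mirror the failing line):

import torch

attn = torch.rand(10, 400)  # e.g. attention weights over 400 source positions
tags = torch.rand(10, 0)    # an empty tags tensor, as reported in the debugger below
torch.mul(attn, tags)       # RuntimeError: The size of tensor a (400) must match
                            # the size of tensor b (0) at non-singleton dimension 1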
Hi,
have you solved the problem? I have the same problem.
If you have, please tell me the solution.
Sorry, I have forgotten how I got this error, but it works now.
That's strange; here it still doesn't work. This error is raised whenever the --copy_attn parameter is set. When I debug, just as the error says, tags is empty, so its dimension is 0. Did you remove the copy_attn parameter?
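A quick way to confirm this, just before the failing line in onmt/modules/CopyGenerator.py (a sketch; attn and tags are the names from the traceback, and the guard itself is only illustrative, not part of the repo):

print("attn:", attn.size(), "tags:", tags.size())  # tags comes out as (batch, 0)
if tags.numel() == 0:
    # fail early with a clearer message than the broadcasting error
    raise RuntimeError("tags is empty: the batch carries no tag features; "
                       "check the preprocessing step or drop -copy_attn")
mul_attn = torch.mul(attn, tags) * 2  # the original line 99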
Sorry, I made a mistake and just found it, so I am reopening this issue.
Thank you. So did you manually pass in tags? Otherwise I don't think the empty-tags problem can be solved, and it will still raise the error. Or maybe our versions are different: in your version tags is not empty, so it works for you. This is my environment:
torch 0.3.1
torchtext 0.2.3
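(To compare environments, a trivial sketch that prints the same information:)

import torch
import pkg_resources

print("torch:", torch.__version__)  # reported above: 0.3.1
print("torchtext:", pkg_resources.get_distribution("torchtext").version)  # reported above: 0.2.3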
In my version, tags is empty too.
I think I should use the original OpenNMT when training.
Thanks. You mean that using the original OpenNMT for training can solve this "empty tags" problem?
I will try it, thanks a lot.
Yes, I tried it today, but since the original OpenNMT has changed a lot, it doesn't work at test time. Do you know how to fork the same version of the original OpenNMT that this branch is based on?
Sorry, I don't know which version the author's changes are based on. Maybe something after 0.7? To be sure, I would have to compare the corresponding code... sad.
May I communicate with you privately on CSDN?
I also have the same problem; please help me fix this. Thank you.