LUMIA-Group/rasat

Size mismatch error when training on the Spider dataset

Zoeyyao27 opened this issue · 1 comments

When I run

```
CUDA_VISIBLE_DEVICES="0" python3 seq2seq/run_seq2seq.py configs/spider/train_spider_rasat_small.json
```

I get the following error:
```
Traceback (most recent call last):
  File "/home/yaoy/convertsql2tree/RASAT/seq2seq/run_seq2seq.py", line 292, in <module>
    main()
  File "/home/yaoy/convertsql2tree/RASAT/seq2seq/run_seq2seq.py", line 237, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/transformers/trainer.py", line 1325, in train
    tr_loss_step = self.training_step(model, inputs)
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/transformers/trainer.py", line 1884, in training_step
    loss = self.compute_loss(model, inputs)
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/transformers/trainer.py", line 1916, in compute_loss
    outputs = model(**inputs)
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/yaoy/convertsql2tree/RASAT/seq2seq/model/t5_relation_model.py", line 1742, in forward
    encoder_outputs = self.encoder(
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/yaoy/convertsql2tree/RASAT/seq2seq/model/t5_relation_model.py", line 1135, in forward
    layer_outputs = checkpoint(
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/torch/utils/checkpoint.py", line 177, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/torch/utils/checkpoint.py", line 75, in forward
    outputs = run_function(*args)
  File "/home/yaoy/convertsql2tree/RASAT/seq2seq/model/t5_relation_model.py", line 1131, in custom_forward
    return tuple(module(*inputs, use_cache, output_attentions))
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/yaoy/convertsql2tree/RASAT/seq2seq/model/t5_relation_model.py", line 755, in forward
    self_attention_outputs = self.layer[0](
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/yaoy/convertsql2tree/RASAT/seq2seq/model/t5_relation_model.py", line 658, in forward
    attention_output = self.SelfAttention(
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/yaoy/convertsql2tree/RASAT/seq2seq/model/t5_relation_model.py", line 587, in forward
    scores = relative_attention_logits(query_states, key_states, relation_k_states)  # [batch, heads, num queries, num kvs]
  File "/home/yaoy/convertsql2tree/RASAT/seq2seq/model/t5_relation_model.py", line 512, in relative_attention_logits
    q_tr_t_matmul = torch.matmul(q_t, r_t)
RuntimeError: The size of tensor a (472) must match the size of tensor b (468) at non-singleton dimension 1
```
I tried printing the sizes of `q_t` and `r_t`, and got the following:

```
q_t shape: torch.Size([8, 472, 8, 64])
r_t shape: torch.Size([8, 468, 64, 468])
```

I would expect the second dimension of both tensors to be 512. Does anyone have an idea of what went wrong and how to fix it?
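For what it's worth, the failure itself is just `torch.matmul`'s broadcasting rule: the last two dimensions are multiplied as matrices and all leading (batch) dimensions must match or be 1. A minimal sketch using the printed sizes (the meaning of each `r_t` dimension below is my guess from the shapes, not the actual layout in `t5_relation_model.py`):

```python
import torch

# Shapes taken from the printed sizes above; the dimension labels are
# assumptions, only the sizes matter for reproducing the failure.
q_t = torch.randn(8, 472, 8, 64)    # [batch, seq_len=472, heads, d_head]
r_t = torch.randn(8, 468, 64, 468)  # [batch, seq_len=468, d_head, seq_len]

# torch.matmul multiplies the last two dims and broadcasts the rest,
# so dim 1 (472 vs. 468) must agree or be 1 -- here it does neither.
try:
    torch.matmul(q_t, r_t)
    raised = False
except RuntimeError as err:
    raised = True
    print(err)  # "The size of tensor a (472) must match ... dimension 1"

# With matching sequence lengths the batched matmul goes through:
r_t_fixed = torch.randn(8, 472, 64, 468)
out = torch.matmul(q_t, r_t_fixed)
print(out.shape)  # torch.Size([8, 472, 8, 468])
```

So the question reduces to why the relation tensor was built for a 468-token sequence while the query tensor covers 472 tokens.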

Are you starting this training from an existing model checkpoint, or from scratch?