wanyao1992/code_summarization_public

# Error while training the Hybrid Model: "Function CatBackward returned an invalid gradient at index 1 - got [85, 1, 512] but expected shape compatible with [57, 1, 512]"

sawan16 opened this issue · 1 comment

### Runtime log
```
python a2c-train.py -data dataset/train/processed_all.train.pt -save_dir dataset//result/ -embedding_w2v dataset/train/ -start_reinforce 10 -end_epoch 30 -critic_pretrain_epochs 10 -data_type hybrid -has_attn 1 -gpus 0
Start...
* vocabulary size. source = 50004; target = 31415
* number of XENT training sentences. 54426
* number of PG training sentences. 54426
* maximum batch size. 32
Building model...
use_critic: True
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/rnn.py:50: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.3 and num_layers=1
  "num_layers={}".format(dropout, num_layers))
model: Hybrid2SeqModel(
  (code_encoder): TreeEncoder(
    (word_lut): Embedding(50004, 512, padding_idx=0)
    (leaf_module): BinaryTreeLeafModule(
      (cx): Linear(in_features=512, out_features=512, bias=True)
      (ox): Linear(in_features=512, out_features=512, bias=True)
    )
    (composer): BinaryTreeComposer(
      (ilh): Linear(in_features=512, out_features=512, bias=True)
      (irh): Linear(in_features=512, out_features=512, bias=True)
      (lflh): Linear(in_features=512, out_features=512, bias=True)
      (lfrh): Linear(in_features=512, out_features=512, bias=True)
      (rflh): Linear(in_features=512, out_features=512, bias=True)
      (rfrh): Linear(in_features=512, out_features=512, bias=True)
      (ulh): Linear(in_features=512, out_features=512, bias=True)
      (urh): Linear(in_features=512, out_features=512, bias=True)
    )
  )
  (text_encoder): Encoder(
    (word_lut): Embedding(50004, 512, padding_idx=0)
    (rnn): LSTM(512, 512, dropout=0.3)
  )
  (decoder): HybridDecoder(
    (word_lut): Embedding(31415, 512, padding_idx=0)
    (rnn): StackedLSTM(
      (dropout): Dropout(p=0.3, inplace=False)
      (layers): ModuleList(
        (0): LSTMCell(1024, 512)
      )
    )
    (attn): HybridAttention(
      (linear_in): Linear(in_features=512, out_features=512, bias=False)
      (sm): Softmax(dim=None)
      (linear_out): Linear(in_features=2048, out_features=512, bias=False)
      (tanh): Tanh()
    )
    (dropout): Dropout(p=0.3, inplace=False)
  )
  (generator): BaseGenerator(
    (generator): Linear(in_features=512, out_features=31415, bias=True)
  )
)
optim: <lib.train.Optim.Optim object at 0x7f34d70f0c50>
opt.start_reinforce: 10
* number of parameters: 92592823
opt.eval: False
opt.eval_sample: False
supervised_data.src: 54426
supervised_data.tgt: 54426
supervised_data.trees: 54426
supervised_data.leafs: 54426
supervised training..
start_epoch: 1
* XENT epoch *
Model optim lr: 0.001
<class 'lib.data.Dataset.Dataset'> 54426
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1351: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
  warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1340: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
  warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
/content/drive/My Drive/notebooks/Python_method_name_prediction/code_summarization_public/lib/model/HybridAttention.py:34: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  attn_tree = self.sm(attn_tree)
/content/drive/My Drive/notebooks/Python_method_name_prediction/code_summarization_public/lib/model/HybridAttention.py:36: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  attn_txt = self.sm(attn_txt)
outputs: torch.Size([26, 32, 512])
/content/drive/My Drive/notebooks/Python_method_name_prediction/code_summarization_public/lib/metric/Loss.py:8: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
  log_dist = F.log_softmax(logits)
loss value: 3042.23095703125
---else---
torch.Size([26, 32, 512])
torch.Size([26, 32, 512])
Traceback (most recent call last):
  File "a2c-train.py", line 339, in <module>
    main()
  File "a2c-train.py", line 321, in main
    xent_trainer.train(opt.start_epoch, opt.start_reinforce - 1, start_time)
  File "/content/drive/My Drive/notebooks/Python_method_name_prediction/code_summarization_public/lib/train/Trainer.py", line 30, in train
    train_loss = self.train_epoch(epoch)
  File "/content/drive/My Drive/notebooks/Python_method_name_prediction/code_summarization_public/lib/train/Trainer.py", line 85, in train_epoch
    loss = self.model.backward(outputs, targets, weights, num_words, self.loss_func)
  File "/content/drive/My Drive/notebooks/Python_method_name_prediction/code_summarization_public/lib/model/EncoderDecoder.py", line 547, in backward
    outputs.backward(grad_output)
  File "/usr/local/lib/python3.6/dist-packages/torch/tensor.py", line 195, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/usr/local/lib/python3.6/dist-packages/torch/autograd/__init__.py", line 99, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: Function CatBackward returned an invalid gradient at index 1 - got [85, 1, 512] but expected shape compatible with [57, 1, 512]
failed.
```
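
For context on what the error means: during `backward()`, autograd checks that the gradient produced for each input of a node matches the shape that input had in the forward pass. `CatBackward` splits the incoming gradient according to the sizes `torch.cat` saw at forward time, so a gradient of `[85, 1, 512]` arriving where `[57, 1, 512]` is expected usually means a tensor built for one example or batch (85 steps/leaves) is still wired into the graph of another (57). Below is a generic sketch of how to localize such a mismatch with gradient hooks; the tensor names (`tree_state`, `text_state`) are illustrative stand-ins, not identifiers from this repo:

```python
import torch

# Hypothetical stand-ins for the forward-pass tensors; in the real model
# these would be the tree-encoder and text-encoder states that get
# concatenated before decoding.
tree_state = torch.randn(57, 1, 512, requires_grad=True)
text_state = torch.randn(57, 1, 512, requires_grad=True)

def shape_logger(name):
    # register_hook fires during backward with the gradient w.r.t. the
    # tensor, so a mismatched shape shows up right before autograd
    # raises "returned an invalid gradient".
    def hook(grad):
        print(f"{name}: grad shape {tuple(grad.shape)}")
        return grad
    return hook

combined = torch.cat([tree_state, text_state], dim=0)
combined.register_hook(shape_logger("combined"))
tree_state.register_hook(shape_logger("tree_state"))
text_state.register_hook(shape_logger("text_state"))

combined.sum().backward()
# Each input should report (57, 1, 512); if one reports a different
# leading dimension, that tensor is shared with a graph built from an
# input of a different length.
```

Registering the same kind of hook on the tree-encoder and text-encoder outputs inside `Hybrid2SeqModel` for the failing batch should reveal which side receives the stale 85-length gradient.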

I have met the same problem. Is there any solution for this?
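
Not a confirmed fix, but one pattern worth ruling out: if any encoder or decoder state is carried over between batches of different lengths without being detached, `backward()` can flow gradients into graph state that was built for a different sequence length. A minimal sketch of the standard guard, assuming a state-carrying loop (the `rnn` module and loop here are placeholders, not this repo's training loop):

```python
import torch
import torch.nn as nn

rnn = nn.LSTM(512, 512)
h = (torch.zeros(1, 1, 512), torch.zeros(1, 1, 512))

# Two consecutive "batches" with different lengths, mimicking the
# 85-step vs. 57-step shapes from the traceback.
for seq_len in (85, 57):
    x = torch.randn(seq_len, 1, 512)
    out, h = rnn(x, h)
    out.sum().backward()
    # Drop the old graph before reusing the state, so backward() on the
    # next batch never crosses the batch boundary.
    h = tuple(s.detach() for s in h)
```

If the hybrid model's tree or text states are cached anywhere between examples, the same `detach()` treatment would be the thing to try first.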