Exact pytorch version requirement
AshishSardana opened this issue · 3 comments
Hi @xlhex ,
I'm trying to run train.py with PyTorch 1.4 (Docker container nvcr.io/nvidia/pytorch:20.01-py3), which results in a PyTorch-related error:
root@dc5fb4969999:/SceneGraphModification/code# python train.py --data-dir $DATA --epochs $EPOCH --seed 1 --ckpt-dir $CKPT_DIR --modification $FUSION --batch-size 256 --accumulation-steps 1 > $log
/opt/conda/lib/python3.6/site-packages/torch/nn/modules/rnn.py:50: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.1 and num_layers=1
"num_layers={}".format(dropout, num_layers))
Traceback (most recent call last):
File "train.py", line 166, in <module>
main()
File "train.py", line 133, in main
loss = model(samples["src_graph"], samples["src_text"], samples["tgt_graph"])
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/media/d2b/ashish/tme/gaugan/SceneGraphModification/code/models.py", line 167, in forward
_, node_outputs, _, edge_outputs = self.graph_dec(enc_info, tgt_graph["nodes"], tgt_graph["edges"])
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/media/d2b/ashish/tme/gaugan/SceneGraphModification/code/models.py", line 487, in forward
node_rnn_outputs, _, node_outputs = self.node_forward(enc_info, nodes["x"], nodes_lens)
File "/media/d2b/ashish/tme/gaugan/SceneGraphModification/code/models.py", line 440, in node_forward
context, _ = self.node_att(rnn_outputs, enc_info["mem"], enc_info["mem_masks"])
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/media/d2b/ashish/tme/gaugan/SceneGraphModification/code/models.py", line 366, in forward
align.masked_fill_(1 - mask, -float('inf'))
File "/opt/conda/lib/python3.6/site-packages/torch/tensor.py", line 394, in __rsub__
return _C._VariableFunctions.rsub(self, other)
RuntimeError: Subtraction, the `-` operator, with a bool tensor is not supported. If you are trying to invert a mask, use the `~` or `logical_not()` operator instead.
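For readers hitting the same error: the message itself points at the fix, which is to invert the mask with `~` rather than `1 - mask`. Below is a minimal sketch of that change, assuming the mask marks valid positions with 1/True (as the `1 - mask` in models.py line 366 suggests) and assuming a PyTorch version that provides `Tensor.bool()`. The helper name `mask_padding` and the toy tensors are hypothetical and not part of the repository.

```python
import torch


def mask_padding(align: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Fill positions where `mask` is 0/False with -inf (hypothetical helper)."""
    # Casting to bool first lets the same code accept uint8 masks and bool masks.
    invalid = ~mask.bool()
    # Out-of-place variant of the masked_fill_ call shown in the traceback.
    return align.masked_fill(invalid, -float("inf"))


# Toy attention scores for a batch of 2 with 3 memory slots each.
align = torch.zeros(2, 3)
# 1/True marks a valid slot, 0/False marks padding (assumed convention).
mask = torch.tensor([[1, 1, 0], [1, 0, 0]], dtype=torch.bool)
masked = mask_padding(align, mask)
```

Whether this is the only place in the codebase that needs the change is not confirmed by the thread; other `1 - mask` expressions would need the same treatment.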
I've also tried running it with PyTorch 1.8, which leads to a different PyTorch error:
root@1d6ce713c477:/SceneGraphModification/code# python train.py --data-dir $DATA --epochs $EPOCH --seed 1 --ckpt-dir $CKPT_DIR --modification $FUSION --batch-size 256 --accumulation-steps 1 > $log
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/rnn.py:58: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.1 and num_layers=1
warnings.warn("dropout option adds dropout after all but last "
/media/d2b/ashish/tme/gaugan/SceneGraphModification/code/data_utils.py:37: UserWarning: This overload of nonzero is deprecated:
nonzero()
Consider using one of the following signatures instead:
nonzero(*, bool as_tuple) (Triggered internally at ../torch/csrc/utils/python_arg_parser.cpp:962.)
flat_edges = [edge.view(-1)[torch.tril(edge, -1).view(-1).nonzero()].view(-1) for edge in edges]
Traceback (most recent call last):
File "train.py", line 166, in <module>
main()
File "train.py", line 133, in main
loss = model(samples["src_graph"], samples["src_text"], samples["tgt_graph"])
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 744, in _call_impl
result = self.forward(*input, **kwargs)
File "/media/d2b/ashish/tme/gaugan/SceneGraphModification/code/models.py", line 167, in forward
_, node_outputs, _, edge_outputs = self.graph_dec(enc_info, tgt_graph["nodes"], tgt_graph["edges"])
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 744, in _call_impl
result = self.forward(*input, **kwargs)
File "/media/d2b/ashish/tme/gaugan/SceneGraphModification/code/models.py", line 487, in forward
node_rnn_outputs, _, node_outputs = self.node_forward(enc_info, nodes["x"], nodes_lens)
File "/media/d2b/ashish/tme/gaugan/SceneGraphModification/code/models.py", line 435, in node_forward
padded_nodes_embeds = nn.utils.rnn.pack_padded_sequence(nodes_embeds, nodes_len, batch_first=True)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/utils/rnn.py", line 245, in pack_padded_sequence
_VF._pack_padded_sequence(input, lengths, batch_first)
RuntimeError: 'lengths' argument should be a 1D CPU int64 tensor, but got 1D cuda:0 Long tensor
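This second error says that `pack_padded_sequence` wants its `lengths` argument on the CPU as an int64 tensor, even when the padded input lives on the GPU. A minimal sketch of a workaround follows, assuming the lengths are already sorted in descending order (the default `enforce_sorted=True` requires this). The helper name `pack_nodes` and the toy shapes are hypothetical; the actual call sits in models.py line 435.

```python
import torch
import torch.nn as nn


def pack_nodes(nodes_embeds: torch.Tensor, nodes_len) -> nn.utils.rnn.PackedSequence:
    """Pack a padded batch, moving the lengths to CPU first (hypothetical helper)."""
    # Newer PyTorch versions reject CUDA length tensors, so force CPU int64.
    lengths = torch.as_tensor(nodes_len, dtype=torch.int64).cpu()
    return nn.utils.rnn.pack_padded_sequence(nodes_embeds, lengths, batch_first=True)


# Toy batch: 3 sequences, padded to length 4, embedding size 8.
embeds = torch.randn(3, 4, 8)
lens = torch.tensor([4, 3, 1])  # sorted in descending order
packed = pack_nodes(embeds, lens)
```

Only the lengths are moved; the embeddings stay on whatever device they were on, so the RNN still runs on the GPU.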
Can you share the exact PyTorch version (and, if it helps, the CUDA and cuDNN versions too) that you developed this codebase with?
Appreciate it!
Sorry, I accidentally put PyTorch 1.4 in the README. We used PyTorch 1.1 for all experiments.
Regarding CUDA, we used CUDA 10.0. Hope this helps.
Thank you!
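To compare a local setup against the versions mentioned in this thread (PyTorch 1.1, CUDA 10.0), the installed versions can be read from PyTorch itself. This is a small sketch; the function name `describe_environment` is hypothetical, and the cuDNN version used by the authors is not stated in the thread.

```python
import torch


def describe_environment() -> dict:
    """Collect the PyTorch, CUDA, and cuDNN versions of the current install."""
    return {
        "torch": torch.__version__,
        # CUDA version PyTorch was built against; None on CPU-only builds.
        "cuda": torch.version.cuda,
        # cuDNN version as an integer; None if cuDNN is not available.
        "cudnn": torch.backends.cudnn.version(),
    }


env = describe_environment()
print(env)
```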