Sizes of tensors must match except in dimension 1. Expected size 26 but got size 25 for tensor number 1 in the list.
When loading a model from torch.hub, I am getting the following error: Sizes of tensors must match except in dimension 1. Expected size 26 but got size 25 for tensor number 1 in the list.
Minimal working example:
import torch
from rfa_toolbox import create_graph_from_pytorch_model, visualize_architecture
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
graph = create_graph_from_pytorch_model(model)
Please see issues/6455 for more info.
This is caused by control flow within the forward pass of YoloV5. RFA-Toolbox uses PyTorch's JIT compiler to extract the graph of a model, and the tracer only evaluates the parts of the control flow actually touched by the forward pass during the trace.
For some reason, this requires any YoloV5 model to be in train mode in order to be traceable by PyTorch's JIT compiler.
If you put the model in train mode, it works.
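To illustrate what the tracer does with input-dependent control flow, here is a minimal, purely hypothetical sketch (BranchyModule is not part of YoloV5 or rfa_toolbox): the trace only records the branch that the example input happens to take.

import torch
import torch.nn as nn

class BranchyModule(nn.Module):
    # Hypothetical module whose forward pass branches on the input shape.
    def forward(self, x):
        if x.shape[-1] > 32:  # resolved at trace time, not part of the graph
            return x * 2
        return x + 1

traced = torch.jit.trace(BranchyModule(), torch.randn(1, 3, 64, 64))
small = torch.randn(1, 3, 16, 16)
# The traced graph contains only the "x * 2" branch; the eager module
# would return small + 1 here, but the trace still computes small * 2.
print(torch.allclose(traced(small), small * 2))  # True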
Here is some example code that fixes the issue:
import torch
from rfa_toolbox import create_graph_from_pytorch_model, visualize_architecture
# Model
model_name = "YoloV5s"
model = torch.hub.load('ultralytics/yolov5', f'{model_name.lower()}') # or yolov5m, yolov5l, yolov5x, custom
#model = torch.hub.load('ultralytics/yolov5', 'custom', 'yolov5s.onnx') # or yolov5m, yolov5l, yolov5x, custom
model.train()  # YoloV5 is only traceable in train mode (explained below)
graph = create_graph_from_pytorch_model(model.cpu(), (4, 3, 640, 640))
visualize_architecture(graph, model_name=f"{model_name}", input_res=10000).render(f"{model_name}")
@MLRichter this does not solve the problem for me; it results in the same error again. Please have a look at the log:
Using cache found in /root/.cache/torch/hub/ultralytics_yolov5_master
YOLOv5 🚀 2022-1-31 torch 1.10.1+cu113 CUDA:0 (NVIDIA GeForce RTX 3090, 24268MiB)
Fusing layers...
Model Summary: 213 layers, 7225885 parameters, 0 gradients
Adding AutoShape...
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
/tmp/ipykernel_3645/3417301980.py in <module>
7 #model = torch.hub.load('ultralytics/yolov5', 'custom', 'yolov5s.onnx') # or yolov5m, yolov5l, yolov5x, custom
8 model.train()
----> 9 graph = create_graph_from_pytorch_model(model.cpu(), (4, 3, 640, 640))
10 visualize_architecture(graph, model_name=f"{model_name}", input_res=10000).render(f"{model_name}")
/opt/conda/lib/python3.8/site-packages/rfa_toolbox/encodings/pytorch/ingest_architecture.py in create_graph_from_model(model, filter_rf, input_res, custom_layers)
411 else KNOWN_FILTER_MAPPING[filter_rf]
412 )
--> 413 tm = torch.jit.trace(model, (torch.randn(*input_res),))
414 return make_graph(
415 tm, filter_rf=filter_func, ref_mod=model, classes_to_not_visit=custom_layers
/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py in trace(func, example_inputs, optimize, check_trace, check_inputs, check_tolerance, strict, _force_outplace, _module_class, _compilation_unit)
739
740 if isinstance(func, torch.nn.Module):
--> 741 return trace_module(
742 func,
743 {"forward": example_inputs},
/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py in trace_module(mod, inputs, optimize, check_trace, check_inputs, check_tolerance, strict, _force_outplace, _module_class, _compilation_unit)
956 example_inputs = make_tuple(example_inputs)
957
--> 958 module._c._create_method_from_trace(
959 method_name,
960 func,
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1101 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102 return forward_call(*input, **kwargs)
1103 # Do not call functions when jit is used
1104 full_backward_hooks, non_full_backward_hooks = [], []
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py in _slow_forward(self, *input, **kwargs)
1088 recording_scopes = False
1089 try:
-> 1090 result = self.forward(*input, **kwargs)
1091 finally:
1092 if recording_scopes:
/opt/conda/lib/python3.8/site-packages/torch/autograd/grad_mode.py in decorate_context(*args, **kwargs)
26 def decorate_context(*args, **kwargs):
27 with self.__class__():
---> 28 return func(*args, **kwargs)
29 return cast(F, decorate_context)
30
~/.cache/torch/hub/ultralytics_yolov5_master/models/common.py in forward(self, imgs, size, augment, profile)
508 if isinstance(imgs, torch.Tensor): # torch
509 with amp.autocast(enabled=autocast):
--> 510 return self.model(imgs.to(p.device).type_as(p), augment, profile) # inference
511
512 # Pre-process
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1101 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102 return forward_call(*input, **kwargs)
1103 # Do not call functions when jit is used
1104 full_backward_hooks, non_full_backward_hooks = [], []
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py in _slow_forward(self, *input, **kwargs)
1088 recording_scopes = False
1089 try:
-> 1090 result = self.forward(*input, **kwargs)
1091 finally:
1092 if recording_scopes:
~/.cache/torch/hub/ultralytics_yolov5_master/models/common.py in forward(self, im, augment, visualize, val)
397 b, ch, h, w = im.shape # batch, channel, height, width
398 if self.pt or self.jit: # PyTorch
--> 399 y = self.model(im) if self.jit else self.model(im, augment=augment, visualize=visualize)
400 return y if val else y[0]
401 elif self.dnn: # ONNX OpenCV DNN
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1101 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102 return forward_call(*input, **kwargs)
1103 # Do not call functions when jit is used
1104 full_backward_hooks, non_full_backward_hooks = [], []
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py in _slow_forward(self, *input, **kwargs)
1088 recording_scopes = False
1089 try:
-> 1090 result = self.forward(*input, **kwargs)
1091 finally:
1092 if recording_scopes:
~/.cache/torch/hub/ultralytics_yolov5_master/models/yolo.py in forward(self, x, augment, profile, visualize)
124 if augment:
125 return self._forward_augment(x) # augmented inference, None
--> 126 return self._forward_once(x, profile, visualize) # single-scale inference, train
127
128 def _forward_augment(self, x):
~/.cache/torch/hub/ultralytics_yolov5_master/models/yolo.py in _forward_once(self, x, profile, visualize)
147 if profile:
148 self._profile_one_layer(m, x, dt)
--> 149 x = m(x) # run
150 y.append(x if m.i in self.save else None) # save output
151 if visualize:
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1101 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102 return forward_call(*input, **kwargs)
1103 # Do not call functions when jit is used
1104 full_backward_hooks, non_full_backward_hooks = [], []
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py in _slow_forward(self, *input, **kwargs)
1088 recording_scopes = False
1089 try:
-> 1090 result = self.forward(*input, **kwargs)
1091 finally:
1092 if recording_scopes:
~/.cache/torch/hub/ultralytics_yolov5_master/models/common.py in forward(self, x)
273
274 def forward(self, x):
--> 275 return torch.cat(x, self.d)
276
277
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 26 but got size 25 for tensor number 1 in the list.
I am using the latest YOLOv5 Docker image (I see that the image nowadays also uses torch 1.10.1+cu113), but I am still getting the error above.
I also get an error about the PyTorch version; I am not sure if this is a compatibility issue:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behavior is the source of the following dependency conflicts.
torchtext 0.11.0a0 requires torch==1.10.0a0+0aef44c, but you have torch 1.10.1+cu113 which is incompatible.
I was able to reproduce the error and fix the bug in your code.
import torch
from rfa_toolbox import create_graph_from_pytorch_model, visualize_architecture

def main():
    model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
    model.train()  # train mode avoids the input-dependent control flow
    model.cpu()    # keep the model on the same device as the tracing tensor
    graph = create_graph_from_pytorch_model(model, input_res=(4, 3, 640, 640))
    visualize_architecture(graph, "model").view()

if __name__ == "__main__":
    main()
I will elaborate on the changes to the code:
First, the model was not on the CPU, while the input tensor used for tracing was, which causes the trace to raise a device-mismatch error.
I should probably add some logic that detects the device automatically, though, since this is a trap that many will likely fall into.
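For illustration, such auto-detection could look roughly like this (a minimal sketch; detect_device is a hypothetical helper, not part of rfa_toolbox):

import torch

def detect_device(model: torch.nn.Module) -> torch.device:
    # Infer the device from the first parameter; assumes the model is
    # not sharded across multiple devices.
    try:
        return next(model.parameters()).device
    except StopIteration:
        return torch.device("cpu")  # parameterless model: fall back to CPU

# The tracing tensor could then be created on the model's device:
# tensor = torch.randn(*input_res, device=detect_device(model))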
Second, YoloV5 has somewhat exotic input-size requirements of 640x640 pixels (compared to classifiers, which use much smaller resolutions by default and are far less picky about them). The tensor used for tracing (obtaining the graph) defaults to 399x399 pixels, which is enough for most classifiers but too small for YoloV5.
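This also explains the exact numbers in the error message: YoloV5 repeatedly halves the resolution with stride-2 stages and later concatenates an upsampled stride-32 feature map with a stride-16 one, so the input side length should be a multiple of 32. A back-of-the-envelope check (assuming each stride-2 stage rounds up; the per-layer padding arithmetic may differ slightly):

import math

size = 399
sizes = [size]
for _ in range(5):  # five stride-2 stages, down to stride 32
    size = math.ceil(size / 2)
    sizes.append(size)
print(sizes)  # [399, 200, 100, 50, 25, 13]
# Upsampling the stride-32 map (13) by a factor of 2 gives 26, which is
# then concatenated with the stride-16 map of size 25 -> the reported
# "Expected size 26 but got size 25". With 640, a multiple of 32, the
# sizes come out to [640, 320, 160, 80, 40, 20] and everything lines up.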
The tuple provided in my code sets the shape of this larger tracing tensor. However, since the original posting, version 1.4.2 has been released, which slightly changed the interface and moved this specific argument to the third position. This is why the code in this post passes it as a keyword argument, which the older version did not.
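In other words (a sketch based on the description above; the exact pre-1.4.2 signature is assumed):

# rfa_toolbox < 1.4.2: input_res was the second positional argument
# graph = create_graph_from_pytorch_model(model, (4, 3, 640, 640))
# rfa_toolbox >= 1.4.2: the argument moved, so pass it by keyword
graph = create_graph_from_pytorch_model(model, input_res=(4, 3, 640, 640))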
Third, the model needs to be in train mode, since otherwise control flow that directly depends on the input of the model is enabled during the forward pass. Behavior like this is not supported by the JIT compiler of PyTorch, which we rely on to extract the compute graph, and it will therefore raise an error in the process.
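As a quick sanity check along the lines of the fix above, the trace itself should go through once the model is in train mode on the CPU (a sketch, not guaranteed for every YoloV5 variant):

import torch

model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
model.train()  # skip the input-dependent inference-mode branches
model.cpu()    # match the device of the example input below
traced = torch.jit.trace(model, torch.randn(4, 3, 640, 640))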