Order of torchsummary and model.eval() changes model behaviour
maikefer opened this issue · 2 comments
Dear torchsummary developer(s),
I really appreciate the functionality your package provides, but today I found a fairly big thing/bug/feature that cost me a lot of development (and computation) time.
Apparently, the order in which you call summary(model, ()) and model.eval() matters a lot. Here is a simple example.
First, without the help of torchsummary:
# imports shared by all of the snippets below
import numpy as np
import torch
import torchsummary
from torchvision import models

model0 = models.resnet18(pretrained=True)
model0.eval()
dummy_input0 = torch.from_numpy(np.ones(shape=(1, 3, 224, 224))).float()
with torch.no_grad():
    output0 = model0(dummy_input0)
print(output0[0, :10])
This results in:
[-0.0391, 0.1145, -1.7968, -1.2343, -0.8190, 0.3240, -2.1866, -1.2877, -1.9019, -0.7315]
Awesome - now we know what we can expect as output.
Let's now try it with torchsummary, called after .eval():
model1 = models.resnet18(pretrained=True)
model1.eval()
torchsummary.summary(model1, (3, 224, 224))
dummy_input1 = torch.from_numpy(np.ones(shape=(1, 3, 224, 224))).float()
with torch.no_grad():
    output1 = model1(dummy_input1)
print(output1[0, :10])
This results in:
[-0.0391, 0.1145, -1.7968, -1.2343, -0.8190, 0.3240, -2.1866, -1.2877, -1.9019, -0.7315]
Still working - cool.
Now let's mix things up a bit and call torchsummary before .eval():
model2 = models.resnet18(pretrained=True)
torchsummary.summary(model2, (3, 224, 224))
model2.eval()
dummy_input2 = torch.from_numpy(np.ones(shape=(1, 3, 224, 224))).float()
with torch.no_grad():
    output2 = model2(dummy_input2)
print(output2[0, :10])
This results in:
[-0.7597, -0.2707, -1.3314, -1.1444, -0.6266, -0.0564, -1.3227, -0.8387, -1.7826, -0.7907]
Well, now something weird happened.
This happens on CPU, on a single GPU, and on 2 GPUs. I'm using torchsummary 1.5.1, Python 3.7, PyTorch 1.7.0, CUDA 10.1, and cuDNN 7.6.
Can you explain why this happens, and whether it is intended or a bug? It would be great if you could add this behaviour to your documentation (or did I miss something?).
Thanks!
Ugh, and happy Christmas!
Maybe also check out the solution for this bug in this very similar repo: TylerYep/torchinfo#21
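In the meantime, a workaround sketch that should sidestep the problem (assuming the summary's dummy forward pass updates the BatchNorm running statistics while the model is still in training mode; the snippet below is only illustrative) is to snapshot the pretrained weights before calling summary and restore them right afterwards, or simply to call .eval() first, as in the second snippet above:

import copy  # plus the imports from the snippets above

model = models.resnet18(pretrained=True)
# snapshot all parameters and buffers (the BatchNorm running stats live in the state_dict too)
weights_backup = copy.deepcopy(model.state_dict())
torchsummary.summary(model, (3, 224, 224))
# restore the snapshot so the summary call leaves no trace on the model
model.load_state_dict(weights_backup)
model.eval()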
Can confirm that we experienced this as well when using torchsummary. It made the output of our evaluation scripts inconsistent, and it took a while for us to realize that the innocuous-looking torchsummary.summary(model, size) call was the cause of the problem.
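If anyone wants to verify that the model itself gets silently modified, a minimal check (just a sketch, assuming the BatchNorm buffers are what changes) is to compare one of them before and after the summary call:

model = models.resnet18(pretrained=True)
before = model.bn1.running_mean.clone()  # first BatchNorm layer of resnet18
torchsummary.summary(model, (3, 224, 224))
# expected to print False if the summary's forward pass updated the running statistics
print(torch.allclose(before, model.bn1.running_mean))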