iArunava/ENet-Real-Time-Semantic-Segmentation

Missing keys with vanilla repository

Closed this issue · 2 comments

Loading the pretrained weights into the current model from this repository throws the following errors:

Missing key(s) in state_dict: "b10.batchnorm1.weight", "b10.batchnorm1.bias", "b10.batchnorm1.running_mean", "b10.batchnorm1.running_var", "b10.batchnorm3.weight", "b10.batchnorm3.bias", "b10.batchnorm3.running_mean", "b10.batchnorm3.running_var", "b11.batchnorm1.weight", "b11.batchnorm1.bias", "b11.batchnorm1.running_mean", "b11.batchnorm1.running_var", "b11.batchnorm3.weight", "b11.batchnorm3.bias", "b11.batchnorm3.running_mean", "b11.batchnorm3.running_var", "b12.batchnorm1.weight", "b12.batchnorm1.bias", "b12.batchnorm1.running_mean", "b12.batchnorm1.running_var", "b12.batchnorm3.weight", "b12.batchnorm3.bias", "b12.batchnorm3.running_mean", "b12.batchnorm3.running_var", "b13.batchnorm1.weight", "b13.batchnorm1.bias", "b13.batchnorm1.running_mean", "b13.batchnorm1.running_var", "b13.batchnorm3.weight", "b13.batchnorm3.bias", "b13.batchnorm3.running_mean", "b13.batchnorm3.running_var", "b14.batchnorm1.weight", "b14.batchnorm1.bias", "b14.batchnorm1.running_mean", "b14.batchnorm1.running_var", "b14.batchnorm3.weight", "b14.batchnorm3.bias", "b14.batchnorm3.running_mean", "b14.batchnorm3.running_var", "b20.batchnorm1.weight", "b20.batchnorm1.bias", "b20.batchnorm1.running_mean", "b20.batchnorm1.running_var", "b20.batchnorm3.weight", "b20.batchnorm3.bias", "b20.batchnorm3.running_mean", "b20.batchnorm3.running_var", "b21.batchnorm1.weight", "b21.batchnorm1.bias", "b21.batchnorm1.running_mean", "b21.batchnorm1.running_var", "b21.batchnorm3.weight", "b21.batchnorm3.bias", "b21.batchnorm3.running_mean", "b21.batchnorm3.running_var", "b22.batchnorm1.weight", "b22.batchnorm1.bias", "b22.batchnorm1.running_mean", "b22.batchnorm1.running_var", "b22.batchnorm3.weight", "b22.batchnorm3.bias", "b22.batchnorm3.running_mean", "b22.batchnorm3.running_var", "b23.batchnorm1.weight", "b23.batchnorm1.bias", "b23.batchnorm1.running_mean", "b23.batchnorm1.running_var", "b23.batchnorm3.weight", "b23.batchnorm3.bias", "b23.batchnorm3.running_mean", "b23.batchnorm3.running_var", 
"b24.batchnorm1.weight", "b24.batchnorm1.bias", "b24.batchnorm1.running_mean", "b24.batchnorm1.running_var", "b24.batchnorm3.weight", "b24.batchnorm3.bias", "b24.batchnorm3.running_mean", "b24.batchnorm3.running_var", "b25.batchnorm1.weight", "b25.batchnorm1.bias", "b25.batchnorm1.running_mean", "b25.batchnorm1.running_var", "b25.batchnorm3.weight", "b25.batchnorm3.bias", "b25.batchnorm3.running_mean", "b25.batchnorm3.running_var", "b26.batchnorm1.weight", "b26.batchnorm1.bias", "b26.batchnorm1.running_mean", "b26.batchnorm1.running_var", "b26.batchnorm3.weight", "b26.batchnorm3.bias", "b26.batchnorm3.running_mean", "b26.batchnorm3.running_var", "b27.batchnorm1.weight", "b27.batchnorm1.bias", "b27.batchnorm1.running_mean", "b27.batchnorm1.running_var", "b27.batchnorm3.weight", "b27.batchnorm3.bias", "b27.batchnorm3.running_mean", "b27.batchnorm3.running_var", "b28.batchnorm1.weight", "b28.batchnorm1.bias", "b28.batchnorm1.running_mean", "b28.batchnorm1.running_var", "b28.batchnorm3.weight", "b28.batchnorm3.bias", "b28.batchnorm3.running_mean",
"b28.batchnorm3.running_var", "b31.batchnorm1.weight", "b31.batchnorm1.bias", "b31.batchnorm1.running_mean", "b31.batchnorm1.running_var", "b31.batchnorm3.weight", "b31.batchnorm3.bias", "b31.batchnorm3.running_mean", "b31.batchnorm3.running_var", "b32.batchnorm1.weight", "b32.batchnorm1.bias", "b32.batchnorm1.running_mean", "b32.batchnorm1.running_var", "b32.batchnorm3.weight", "b32.batchnorm3.bias", "b32.batchnorm3.running_mean", "b32.batchnorm3.running_var", "b33.batchnorm1.weight", "b33.batchnorm1.bias", "b33.batchnorm1.running_mean", "b33.batchnorm1.running_var", "b33.batchnorm3.weight", "b33.batchnorm3.bias", "b33.batchnorm3.running_mean", "b33.batchnorm3.running_var", "b34.batchnorm1.weight", "b34.batchnorm1.bias", "b34.batchnorm1.running_mean", "b34.batchnorm1.running_var", "b34.batchnorm3.weight", "b34.batchnorm3.bias", "b34.batchnorm3.running_mean", "b34.batchnorm3.running_var", "b35.batchnorm1.weight", "b35.batchnorm1.bias", "b35.batchnorm1.running_mean", "b35.batchnorm1.running_var", "b35.batchnorm3.weight", "b35.batchnorm3.bias", "b35.batchnorm3.running_mean", "b35.batchnorm3.running_var", "b36.batchnorm1.weight", "b36.batchnorm1.bias", "b36.batchnorm1.running_mean", "b36.batchnorm1.running_var", "b36.batchnorm3.weight", "b36.batchnorm3.bias", "b36.batchnorm3.running_mean", "b36.batchnorm3.running_var", "b37.batchnorm1.weight", "b37.batchnorm1.bias", "b37.batchnorm1.running_mean", "b37.batchnorm1.running_var", "b37.batchnorm3.weight", "b37.batchnorm3.bias", "b37.batchnorm3.running_mean", "b37.batchnorm3.running_var", "b38.batchnorm1.weight", "b38.batchnorm1.bias", "b38.batchnorm1.running_mean", "b38.batchnorm1.running_var", "b38.batchnorm3.weight", "b38.batchnorm3.bias", "b38.batchnorm3.running_mean", "b38.batchnorm3.running_var", "b40.batchnorm1.weight", "b40.batchnorm1.bias", "b40.batchnorm1.running_mean", "b40.batchnorm1.running_var", "b40.batchnorm3.weight", "b40.batchnorm3.bias", "b40.batchnorm3.running_mean", "b40.batchnorm3.running_var", 
"b41.batchnorm1.weight", "b41.batchnorm1.bias",
"b41.batchnorm1.running_mean", "b41.batchnorm1.running_var", "b41.batchnorm3.weight", "b41.batchnorm3.bias", "b41.batchnorm3.running_mean", "b41.batchnorm3.running_var",
"b42.batchnorm1.weight", "b42.batchnorm1.bias", "b42.batchnorm1.running_mean", "b42.batchnorm1.running_var", "b42.batchnorm3.weight", "b42.batchnorm3.bias", "b42.batchnorm3.running_mean", "b42.batchnorm3.running_var", "b50.batchnorm1.weight", "b50.batchnorm1.bias", "b50.batchnorm1.running_mean", "b50.batchnorm1.running_var", "b50.batchnorm3.weight", "b50.batchnorm3.bias", "b50.batchnorm3.running_mean", "b50.batchnorm3.running_var", "b51.batchnorm1.weight", "b51.batchnorm1.bias", "b51.batchnorm1.running_mean", "b51.batchnorm1.running_var", "b51.batchnorm3.weight", "b51.batchnorm3.bias", "b51.batchnorm3.running_mean", "b51.batchnorm3.running_var".
Unexpected key(s) in state_dict: "b10.batchnorm.weight", "b10.batchnorm.bias", "b10.batchnorm.running_mean", "b10.batchnorm.running_var", "b10.batchnorm.num_batches_tracked", "b11.batchnorm.weight", "b11.batchnorm.bias", "b11.batchnorm.running_mean", "b11.batchnorm.running_var", "b11.batchnorm.num_batches_tracked", "b12.batchnorm.weight", "b12.batchnorm.bias", "b12.batchnorm.running_mean", "b12.batchnorm.running_var", "b12.batchnorm.num_batches_tracked", "b13.batchnorm.weight", "b13.batchnorm.bias", "b13.batchnorm.running_mean", "b13.batchnorm.running_var", "b13.batchnorm.num_batches_tracked", "b14.batchnorm.weight", "b14.batchnorm.bias", "b14.batchnorm.running_mean", "b14.batchnorm.running_var", "b14.batchnorm.num_batches_tracked", "b20.batchnorm.weight", "b20.batchnorm.bias", "b20.batchnorm.running_mean", "b20.batchnorm.running_var", "b20.batchnorm.num_batches_tracked", "b21.batchnorm.weight", "b21.batchnorm.bias", "b21.batchnorm.running_mean", "b21.batchnorm.running_var", "b21.batchnorm.num_batches_tracked", "b22.batchnorm.weight", "b22.batchnorm.bias", "b22.batchnorm.running_mean", "b22.batchnorm.running_var", "b22.batchnorm.num_batches_tracked", "b23.batchnorm.weight", "b23.batchnorm.bias", "b23.batchnorm.running_mean", "b23.batchnorm.running_var", "b23.batchnorm.num_batches_tracked", "b24.batchnorm.weight", "b24.batchnorm.bias", "b24.batchnorm.running_mean", "b24.batchnorm.running_var", "b24.batchnorm.num_batches_tracked", "b25.batchnorm.weight", "b25.batchnorm.bias", "b25.batchnorm.running_mean", "b25.batchnorm.running_var", "b25.batchnorm.num_batches_tracked", "b26.batchnorm.weight", "b26.batchnorm.bias", "b26.batchnorm.running_mean", "b26.batchnorm.running_var", "b26.batchnorm.num_batches_tracked", "b27.batchnorm.weight", "b27.batchnorm.bias", "b27.batchnorm.running_mean", "b27.batchnorm.running_var", "b27.batchnorm.num_batches_tracked", "b28.batchnorm.weight", "b28.batchnorm.bias", "b28.batchnorm.running_mean", "b28.batchnorm.running_var", 
"b28.batchnorm.num_batches_tracked", "b31.batchnorm.weight", "b31.batchnorm.bias", "b31.batchnorm.running_mean", "b31.batchnorm.running_var", "b31.batchnorm.num_batches_tracked", "b32.batchnorm.weight", "b32.batchnorm.bias", "b32.batchnorm.running_mean", "b32.batchnorm.running_var", "b32.batchnorm.num_batches_tracked", "b33.batchnorm.weight", "b33.batchnorm.bias", "b33.batchnorm.running_mean", "b33.batchnorm.running_var", "b33.batchnorm.num_batches_tracked", "b34.batchnorm.weight", "b34.batchnorm.bias", "b34.batchnorm.running_mean", "b34.batchnorm.running_var", "b34.batchnorm.num_batches_tracked", "b35.batchnorm.weight", "b35.batchnorm.bias", "b35.batchnorm.running_mean", "b35.batchnorm.running_var", "b35.batchnorm.num_batches_tracked", "b36.batchnorm.weight", "b36.batchnorm.bias", "b36.batchnorm.running_mean", "b36.batchnorm.running_var", "b36.batchnorm.num_batches_tracked", "b37.batchnorm.weight", "b37.batchnorm.bias", "b37.batchnorm.running_mean", "b37.batchnorm.running_var", "b37.batchnorm.num_batches_tracked", "b38.batchnorm.weight", "b38.batchnorm.bias", "b38.batchnorm.running_mean", "b38.batchnorm.running_var", "b38.batchnorm.num_batches_tracked", "b40.batchnorm.weight", "b40.batchnorm.bias", "b40.batchnorm.running_mean", "b40.batchnorm.running_var", "b40.batchnorm.num_batches_tracked", "b41.batchnorm.weight", "b41.batchnorm.bias", "b41.batchnorm.running_mean", "b41.batchnorm.running_var", "b41.batchnorm.num_batches_tracked", "b42.batchnorm.weight", "b42.batchnorm.bias", "b42.batchnorm.running_mean", "b42.batchnorm.running_var", "b42.batchnorm.num_batches_tracked", "b50.batchnorm.weight", "b50.batchnorm.bias", "b50.batchnorm.running_mean", "b50.batchnorm.running_var", "b50.batchnorm.num_batches_tracked", "b51.batchnorm.weight", "b51.batchnorm.bias", "b51.batchnorm.running_mean", "b51.batchnorm.running_var", "b51.batchnorm.num_batches_tracked".
size mismatch for b10.batchnorm2.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([4]).
size mismatch for b10.batchnorm2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([4]).
size mismatch for b10.batchnorm2.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([4]).
size mismatch for b10.batchnorm2.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([4]).
size mismatch for b11.batchnorm2.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b11.batchnorm2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b11.batchnorm2.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b11.batchnorm2.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b12.batchnorm2.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b12.batchnorm2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b12.batchnorm2.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b12.batchnorm2.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b13.batchnorm2.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b13.batchnorm2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b13.batchnorm2.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b13.batchnorm2.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b14.batchnorm2.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b14.batchnorm2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b14.batchnorm2.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b14.batchnorm2.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b20.batchnorm2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b20.batchnorm2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b20.batchnorm2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b20.batchnorm2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b21.batchnorm2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b21.batchnorm2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b21.batchnorm2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b21.batchnorm2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b22.batchnorm2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b22.batchnorm2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b22.batchnorm2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b22.batchnorm2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b23.batchnorm2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b23.batchnorm2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b23.batchnorm2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b23.batchnorm2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b24.batchnorm2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b24.batchnorm2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b24.batchnorm2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b24.batchnorm2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b25.batchnorm2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b25.batchnorm2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b25.batchnorm2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b25.batchnorm2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b26.batchnorm2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b26.batchnorm2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b26.batchnorm2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b26.batchnorm2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b27.batchnorm2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b27.batchnorm2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b27.batchnorm2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b27.batchnorm2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b28.batchnorm2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b28.batchnorm2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b28.batchnorm2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b28.batchnorm2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b31.batchnorm2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b31.batchnorm2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b31.batchnorm2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b31.batchnorm2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b32.batchnorm2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b32.batchnorm2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b32.batchnorm2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b32.batchnorm2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b33.batchnorm2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b33.batchnorm2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b33.batchnorm2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b33.batchnorm2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b34.batchnorm2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b34.batchnorm2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b34.batchnorm2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b34.batchnorm2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b35.batchnorm2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b35.batchnorm2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b35.batchnorm2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b35.batchnorm2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b36.batchnorm2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b36.batchnorm2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b36.batchnorm2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b36.batchnorm2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b37.batchnorm2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b37.batchnorm2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b37.batchnorm2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b37.batchnorm2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b38.batchnorm2.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b38.batchnorm2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b38.batchnorm2.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b38.batchnorm2.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b40.batchnorm2.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b40.batchnorm2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b40.batchnorm2.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b40.batchnorm2.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for b41.batchnorm2.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b41.batchnorm2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b41.batchnorm2.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b41.batchnorm2.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b42.batchnorm2.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b42.batchnorm2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b42.batchnorm2.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b42.batchnorm2.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for b51.batchnorm2.weight: copying a param with shape torch.Size([16]) from checkpoint, the shape in current model is torch.Size([4]).
size mismatch for b51.batchnorm2.bias: copying a param with shape torch.Size([16]) from checkpoint, the shape in current model is torch.Size([4]).
size mismatch for b51.batchnorm2.running_mean: copying a param with shape torch.Size([16]) from checkpoint, the shape in current model is torch.Size([4]).
size mismatch for b51.batchnorm2.running_var: copying a param with shape torch.Size([16]) from checkpoint, the shape in current model is torch.Size([4]).

Not sure what's going on.
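For anyone hitting the same error: the three parts of the message (missing keys, unexpected keys, size mismatches) can be reproduced and diagnosed by diffing the checkpoint's keys and shapes against the current model's `state_dict`. A minimal sketch using hypothetical stand-in modules, not the actual ENet bottleneck classes:

```python
import torch.nn as nn

# Hypothetical minimal modules that reproduce the shape of this error:
# the checkpoint came from a bottleneck with a single `batchnorm` layer,
# while the fixed model defines `batchnorm1`/`batchnorm2`/`batchnorm3`,
# with `batchnorm2` at a different width.
class OldBottleneck(nn.Module):
    def __init__(self):
        super().__init__()
        self.batchnorm = nn.BatchNorm2d(64)
        self.batchnorm2 = nn.BatchNorm2d(64)

class NewBottleneck(nn.Module):
    def __init__(self):
        super().__init__()
        self.batchnorm1 = nn.BatchNorm2d(64)
        self.batchnorm2 = nn.BatchNorm2d(4)   # narrower channel count -> size mismatch
        self.batchnorm3 = nn.BatchNorm2d(64)

checkpoint = OldBottleneck().state_dict()   # stand-in for torch.load("weights.pth")
current = NewBottleneck().state_dict()

missing    = sorted(set(current) - set(checkpoint))
unexpected = sorted(set(checkpoint) - set(current))
mismatched = sorted(k for k in set(current) & set(checkpoint)
                    if current[k].shape != checkpoint[k].shape)

print("missing:", missing)            # batchnorm1.* / batchnorm3.* keys
print("unexpected:", unexpected)      # batchnorm.* keys
print("size mismatch:", mismatched)   # batchnorm2.* keys
```

Running this kind of diff against the real checkpoint and model makes it clear whether a rename could bridge the gap; here the size mismatches show the architectures genuinely differ, so no key remapping can make the old weights fit.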

Hi @awesomeroks.
There was a problem with the BatchNorm layers in the original implementation, so I pushed a fix yesterday. The published weights only work with the previous implementation.
Please train your own model from scratch (I would be glad if you could share the resulting weights so I can add them here).
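If you need the published weights in the meantime, a workaround is to check out the commit that precedes the BatchNorm fix before loading them. A sketch of the git workflow (the commit references are illustrative, not the actual hashes from this repository):

```shell
# Locate the fixing commit in the recent history
git log --oneline -n 5

# Check out the repository state just before the fix landed;
# if the fix is the most recent commit, its parent is HEAD~1
git checkout HEAD~1
```

After loading the weights you can return to the latest code with `git checkout` of the branch name.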

Thanks!

I realized that just a few hours ago and used the previous commit.