ResNet architecture
mileyan opened this issue · 2 comments
Thanks for your great work!
I am a little bit confused about the ResNet architecture.
https://github.com/clovaai/overhaul-distillation/blob/master/ImageNet/models/ResNet.py#L55
The last ReLU has been removed from all res-blocks, not just the last res-block in each res-layer. Will this hurt model performance? Also, have you retrained the model?
Thanks for your help.
Hi
As you mentioned, I removed the last ReLU to obtain pre-ReLU features.
overhaul-distillation/ImageNet/models/ResNet.py
Lines 94 to 95 in f5b9929
Instead, I added a ReLU at the front of each ResBlock.
overhaul-distillation/ImageNet/models/ResNet.py
Lines 40 to 42 in f5b9929
The last ReLU of each block is simply moved to become the first ReLU of the next ResBlock. And I also added a ReLU after the last layer block.
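The reordering above can be sketched with toy blocks (plain convolutions standing in for the repo's full BasicBlock; the names `branch`, `moved_block`, etc. are illustrative, not from the repo). The point is that applying the trailing ReLU of one block at the front of the next block, plus one extra ReLU after the final block, computes exactly the same function, while each block now outputs a pre-ReLU feature:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def branch(conv1, conv2, x):
    # The residual branch F(x); identical in both orderings.
    return conv2(torch.relu(conv1(x)))

c1a, c2a = nn.Conv2d(8, 8, 3, padding=1), nn.Conv2d(8, 8, 3, padding=1)
c1b, c2b = nn.Conv2d(8, 8, 3, padding=1), nn.Conv2d(8, 8, 3, padding=1)
x = torch.relu(torch.randn(2, 8, 4, 4))   # stem output, already post-ReLU

# Standard torchvision ordering: each block ends with a ReLU.
h_std = torch.relu(branch(c1a, c2a, x) + x)
y_std = torch.relu(branch(c1b, c2b, h_std) + h_std)

# Reordered version: the ReLU sits at the front of each block (ReLU is
# idempotent, so the extra ReLU on the already-activated stem output is a
# no-op), and each block returns its residual sum *before* the ReLU.
def moved_block(conv1, conv2, z):
    z = torch.relu(z)                      # ReLU moved from the previous block's tail
    return branch(conv1, conv2, z) + z     # pre-ReLU output, usable for distillation

pre1 = moved_block(c1a, c2a, x)
pre2 = moved_block(c1b, c2b, pre1)
y_moved = torch.relu(pre2)                 # the added ReLU after the last block

assert torch.allclose(y_std, y_moved)      # same function, different attribution
```

So `pre1` and `pre2` are the pre-ReLU features the distillation loss consumes, while the end-to-end computation is unchanged.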
-
My version
overhaul-distillation/ImageNet/models/ResNet.py
Lines 141 to 156 in f5b9929
def forward(self, x):
    x = self.conv1(x)
    x = self.bn1(x)
    x = self.relu(x)
    x = self.maxpool(x)

    x = self.layer1(x)
    x = self.layer2(x)
    x = self.layer3(x)
    x = self.layer4(x)

    x = self.avgpool(x)
    x = torch.flatten(x, 1)
    x = self.fc(x)
    return x
As a result, our network is the same as the original.
My code loads parameters from the official PyTorch model.
overhaul-distillation/ImageNet/models/ResNet.py
Lines 12 to 18 in f5b9929
And the performance is unchanged.
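Loading the official weights works because `nn.ReLU` holds no parameters or buffers, so moving it within a module leaves the `state_dict` keys untouched. A minimal sketch with hypothetical toy modules (`TailReLU`/`FrontReLU` are illustrative names, not the repo's classes):

```python
import torch
import torch.nn as nn

class TailReLU(nn.Module):
    """Original ordering: ReLU applied after the conv."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(8, 8, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.conv(x))

class FrontReLU(nn.Module):
    """Reordered: ReLU applied before the conv."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(8, 8, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.conv(self.relu(x))

src, dst = TailReLU(), FrontReLU()

# Identical keys (only the conv has parameters), so weights trained with
# one ordering load directly into the other.
assert list(src.state_dict().keys()) == list(dst.state_dict().keys())
dst.load_state_dict(src.state_dict())
assert torch.equal(dst.conv.weight, src.conv.weight)
```

The same reasoning applies to the full ResNet: the torchvision `state_dict` loads into the modified model without any key remapping.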
That makes it very clear. Thanks for your help.