clovaai/overhaul-distillation

ResNet architecture

mileyan opened this issue · 2 comments

Thanks for your great work!

I am a little bit confused about the ResNet architecture.
https://github.com/clovaai/overhaul-distillation/blob/master/ImageNet/models/ResNet.py#L55
The last ReLU has been removed from all res-blocks, not just the last res-block in each res-layer. Will this hurt model performance? Also, have you retrained the model?

Thanks for your help.

bhheo commented

Hi

As you mentioned, I removed the last ReLU so that each block outputs a pre-ReLU feature.

out += residual
#out = self.relu(out)

Instead, I added a ReLU at the front of the ResBlock.

def forward(self, x):
    x = F.relu(x)
    residual = x

So the last ReLU of one block simply becomes the first ReLU of the next block.
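The equivalence can be checked numerically. This is a minimal sketch (not the repository's code) with two hypothetical conv layers standing in for consecutive ResBlocks: whether the ReLU is the last op of the first block or the first op of the second block, the composed computation is identical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Two stand-in "blocks"; in the real network these are full ResBlocks.
block1 = nn.Conv2d(4, 4, 3, padding=1, bias=False)
block2 = nn.Conv2d(4, 4, 3, padding=1, bias=False)

x = torch.randn(1, 4, 8, 8)

# Original ordering: ReLU is the last op of block 1.
y_original = block2(F.relu(block1(x)))

# Relocated ordering: block 1 ends pre-ReLU (the feature used for
# distillation), and the same ReLU is applied on entry to block 2.
pre_relu_feature = block1(x)
y_moved = block2(F.relu(pre_relu_feature))

assert torch.equal(y_original, y_moved)
```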

And I also added a ReLU after the last layer block (layer4).

  • My version

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = F.relu(self.layer4(x))
        x = self.avgpool(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x

  • TorchVision

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)

        return x

As a result, our network computes the same function as the original.
My code loads parameters from the official PyTorch models.

model_urls = {
    'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
    'resnet34': 'https://download.pytorch.org/models/resnet34-333f7ec4.pth',
    'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth',
    'resnet101': 'https://download.pytorch.org/models/resnet101-5d3b4d8f.pth',
    'resnet152': 'https://download.pytorch.org/models/resnet152-b121ed2d.pth',
}
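Loading the official weights works without any key remapping because ReLU holds no parameters, so relocating it does not change the state dict. A minimal sketch with hypothetical block classes (not the repository's code) illustrating why the dicts stay key-compatible:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PostReLUBlock(nn.Module):
    """Standard ordering: conv -> bn -> ReLU at the end."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(8)

    def forward(self, x):
        return F.relu(self.bn(self.conv(x)))

class PreReLUBlock(nn.Module):
    """Relocated ordering: ReLU on entry, block ends pre-ReLU."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(8)

    def forward(self, x):
        return self.bn(self.conv(F.relu(x)))

src = PostReLUBlock()
dst = PreReLUBlock()

# Same parameter names, so the official-style weights load strictly.
dst.load_state_dict(src.state_dict(), strict=True)
assert set(src.state_dict().keys()) == set(dst.state_dict().keys())
```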

And the performance is unchanged.

It is very clear. Thanks for your help.