Eromera/erfnet_pytorch

imagenet pretrained model

boundles opened this issue · 9 comments

Hi, could you provide the training script for the ImageNet-pretrained model?

I have just uploaded the script and the encoder's definition for ImageNet pretraining. Please note that I haven't debugged it thoroughly, so if you find any errors please report them. Hope it helps!

There was a typo in the architecture definition, so it was not working until now. I have corrected it and verified that the script works with mainstream PyTorch 0.3.

ok, thanks a lot

I have another question: the ImageNet-pretrained model is trained with data normalization (mean and std), but the image data for your ERFNet decoder training hasn't been normalized. Does that reduce performance?

Hi, what do you mean that the decoder image data hasn't been normalized? Only the RGB input should be normalized, and this is the same for encoder and decoder training.

But it seems that you commented out the normalization code block in train/main.py?

        input = ImageOps.expand(input, border=(transX,transY,0,0), fill=0)
        target = ImageOps.expand(target, border=(transX,transY,0,0), fill=255) #pad label filling with 255
        input = input.crop((0, 0, input.size[0]-transX, input.size[1]-transY))
        target = target.crop((0, 0, target.size[0]-transX, target.size[1]-transY))   

        #TODO future: additional augments
        #CenterCrop(256)  
        #Normalize([.485, .456, .406], [.229, .224, .225]),
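
For context, the lines above implement a random-translation augmentation: pad the top-left of both image and label, then crop the same amount from the bottom-right, so the content shifts while the size stays fixed. A standalone sketch of that step (a hypothetical helper, not a function from the repo):

```python
from PIL import Image, ImageOps

def random_translate(img, label, trans_x, trans_y):
    """Shift image and label by (trans_x, trans_y) pixels, as in the
    snippet above: pad the top-left edges, then crop the same amount
    from the bottom-right, so the output size matches the input."""
    img = ImageOps.expand(img, border=(trans_x, trans_y, 0, 0), fill=0)
    # pad the label with 255 so the new pixels fall on the ignore index
    label = ImageOps.expand(label, border=(trans_x, trans_y, 0, 0), fill=255)
    img = img.crop((0, 0, img.size[0] - trans_x, img.size[1] - trans_y))
    label = label.crop((0, 0, label.size[0] - trans_x, label.size[1] - trans_y))
    return img, label
```

In train/main.py the shift amounts are drawn at random per sample; here they are passed explicitly to keep the sketch deterministic.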

Hi! Yes, that is "legacy" code that remains from other code snippets. Those mean and std values are the ImageNet mean and std. However, when training on Cityscapes you should normalize with Cityscapes statistics, but in my experiments this resulted in ~71.1% IoU instead of the 72.2% you get when you don't normalize. Therefore, by default there is no normalization in the code.
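
If you do want to normalize with dataset statistics, they have to be computed over the training images first. A minimal sketch of that computation (a hypothetical helper, not part of the repo), assuming an iterable of CHW float tensors in [0, 1]:

```python
import torch

def channel_stats(images):
    # Hypothetical helper: accumulate per-channel sums and squared sums
    # over an iterable of CHW float tensors in [0, 1], e.g. to obtain
    # Cityscapes mean/std for a Normalize transform.
    count = 0
    total = torch.zeros(3)
    total_sq = torch.zeros(3)
    for img in images:
        count += img.shape[1] * img.shape[2]
        total += img.sum(dim=(1, 2))
        total_sq += (img ** 2).sum(dim=(1, 2))
    mean = total / count
    std = (total_sq / count - mean ** 2).sqrt()
    return mean, std
```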

If you train the decoder normalizing with that legacy line (ImageNet mean and std), you'll get around 71.8% IoU.
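
For reference, re-enabling that legacy normalization amounts to a per-channel shift and scale with the ImageNet statistics from the commented-out line. A minimal sketch in plain tensor ops (equivalent to what a `Normalize` transform would do):

```python
import torch

# ImageNet statistics from the commented-out legacy line in train/main.py.
IMAGENET_MEAN = torch.tensor([0.485, 0.456, 0.406])
IMAGENET_STD = torch.tensor([0.229, 0.224, 0.225])

def normalize(img):
    """Per-channel normalization of a CHW float tensor in [0, 1]:
    subtract the channel mean, divide by the channel std."""
    return (img - IMAGENET_MEAN[:, None, None]) / IMAGENET_STD[:, None, None]
```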

OK, I got it, thanks a lot.