ox-vgg/vgg_face2

Loading Weights in Pytorch for Prediction of Image from Test Set (Classification)

Closed this issue · 2 comments

I get incorrect predictions with the pre-trained Resnet50_scratch model/weights - e.g. a confidence of less than 1%, where it should be greater than 90% (confirmed by using the same input image in implementations such as keras_vggface).

It looks like others have had issues with this as well. Is it possible to get example code that loads an image and does a successful prediction using the pretrained weights and an image from the training set?

Below are the steps I have followed:

  1. Load the image from disk, reorder the channels of the input RGB image to BGR, resize the image to 255x255 using MTCNN, then subtract the means mentioned in Readme.txt (the mean value of each channel is subtracted from each pixel; mean vector [91.4953, 103.8827, 131.0912] in BGR order). The example below uses a center crop rather than MTCNN, but I have also tried MTCNN with the same results (a sketch of an MTCNN variant follows the two steps).
import torch
from PIL import Image
from torchvision import transforms

def rotate_channels(img):
    # Reverse the channel order of a PIL image (RGB -> BGR).
    return Image.merge("RGB", list(img.split())[::-1])

# Load the image, swap to BGR, and take a 224x224 center crop.
rotate_channels_result = rotate_channels(Image.open(IMAGE_PATH))
center_crop = transforms.CenterCrop(224)(rotate_channels_result)
center_crop_tensor = transforms.ToTensor()(center_crop)

# ToTensor scales to [0, 1], so rescale to [0, 255] before subtracting the per-channel means.
x = (center_crop_tensor * 255) - torch.Tensor([91.4953, 103.8827, 131.0912]).view((3, 1, 1))
  2. Run inference on the resnet50 model, get the scores (picking the classification output from the tuple, not the feature output), and then compute the softmax probabilities.
import imp
import torch

# Load the converted model definition and the pretrained weights.
MainModel = imp.load_source('MainModel', f'{MODEL_PATH}.py')
resnet50_scratch = MainModel.resnet50_scratch(f'{WEIGHTS_PATH}.pth').cuda()

# Add a batch dimension and move the preprocessed tensor to the GPU.
inference_image_tensor = x.unsqueeze(0).cuda()

# Take the classification output from the returned tuple.
classification_scores = resnet50_scratch(inference_image_tensor)[0]

# Top-scoring class and its softmax probability.
top_values, top_indices = torch.max(classification_scores, 1)
percentage = torch.nn.functional.softmax(classification_scores, dim=1)[0] * 100
print(top_indices, percentage[top_indices[0]].item())
# So far the result has been both an incorrect class and a low probability (less than 1%).
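
For reference, an MTCNN-based version of step 1 could look like the sketch below. This is a minimal illustration assuming the facenet-pytorch package (not part of this repo); with post_process=False the detected crop stays in the 0-255 range, so the same mean subtraction applies.

from facenet_pytorch import MTCNN  # assumed third-party package, not part of this repo
from PIL import Image
import torch

mtcnn = MTCNN(image_size=224, margin=0, post_process=False)  # raw 0-255 crop, no standardization

img = Image.open(IMAGE_PATH).convert('RGB')
face = mtcnn(img)                 # (3, 224, 224) float tensor in RGB order, or None if no face is found
face_bgr = face[[2, 1, 0], :, :]  # reorder channels RGB -> BGR
x = face_bgr - torch.Tensor([91.4953, 103.8827, 131.0912]).view((3, 1, 1))  # subtract per-channel means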

Thank you for any help you can give, I am stumped.

Our provided model ends with the feature embedding; we don't release the last classifier layer.
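
A minimal sketch of using the released model that way, i.e. comparing two faces by cosine similarity of their embeddings (assuming the preprocessing above produces x_a and x_b, and that the feature vector is either the forward output itself or the feature element of the returned tuple; the embed helper below is a hypothetical name, not part of the released code):

import torch
import torch.nn.functional as F

def embed(model, x):
    # Hypothetical helper: run one preprocessed (3, 224, 224) BGR, mean-subtracted
    # image through the network and return a flattened feature vector.
    with torch.no_grad():
        out = model(x.unsqueeze(0).cuda())
        feat = out[1] if isinstance(out, tuple) else out  # assumption about which element is the feature
        return feat.flatten(1)

# emb_a = embed(resnet50_scratch, x_a)
# emb_b = embed(resnet50_scratch, x_b)
# similarity = F.cosine_similarity(emb_a, emb_b).item()  # close to 1.0 for the same identity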

If you want that, simply freeze (fix) all the model weights, pass the VGGFace2 dataset through it, and train the last classifier layer yourself.
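
A minimal sketch of that suggestion, assuming a train_loader that yields preprocessed VGGFace2 image tensors and integer identity labels, and that the frozen backbone yields a 2048-dimensional pooled feature (both are assumptions, not part of the released code):

import torch
import torch.nn as nn

NUM_IDENTITIES = 8631  # number of identities in the VGGFace2 training split

# Freeze the released backbone so only the new classifier layer is trained.
for p in resnet50_scratch.parameters():
    p.requires_grad = False
resnet50_scratch.eval()

classifier = nn.Linear(2048, NUM_IDENTITIES).cuda()  # assumes a 2048-d pooled feature
optimizer = torch.optim.SGD(classifier.parameters(), lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()

for images, labels in train_loader:  # hypothetical DataLoader over VGGFace2
    images, labels = images.cuda(), labels.cuda()
    with torch.no_grad():
        feats = resnet50_scratch(images)
        feats = feats[1] if isinstance(feats, tuple) else feats  # assumption about the output layout
        feats = feats.flatten(1)
    loss = criterion(classifier(feats), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()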