Unable to reproduce the result
ankitw497 opened this issue · 5 comments
I am using the given pretrained Caffe model but getting a Euclidean loss much higher than the one reported in the paper. Please look at my code and tell me where I am making a mistake.
I load the Caffe model and do a forward pass to get the output.
CaffeModel.zip
Here is the piece of code:
import math
import csv

import caffe
import numpy as np
import cv2
from PIL import Image
from scipy.io import loadmat


def Predict(tmp_dict):  # tmp_dict maps each sample to its face/eye crops, facegrid and dotInfo paths
    caffe.set_mode_cpu()
    MODEL_FILE = "/mnt/disks/d/mitgaze/code/Model/itracker_deploy.prototxt"
    PRETRAINED = "/mnt/disks/d/mitgaze/code/Model/itracker_iter_92000.caffemodel"

    # load the mean images
    image_mean_face = loadmat("/mnt/disks/d/mitgaze/code/Model/mean_face_224.mat", squeeze_me=True, struct_as_record=False)['image_mean']
    image_mean_left = loadmat("/mnt/disks/d/mitgaze/code/Model/mean_left_224.mat", squeeze_me=True, struct_as_record=False)['image_mean']
    image_mean_right = loadmat("/mnt/disks/d/mitgaze/code/Model/mean_right_224.mat", squeeze_me=True, struct_as_record=False)['image_mean']

    net = caffe.Classifier(MODEL_FILE, PRETRAINED)
    list_of_score = []
    with open("/mnt/disks/d/mitgaze/code/Result/loss_withMeanSubs.txt", "w") as fout:
        for sample in tmp_dict:
            facegridpath = list(tmp_dict[sample]['facegrid'].items())
            FaceGrid = np.loadtxt(facegridpath[0][1])
            dotInfopath = list(tmp_dict[sample]['dotInfo'].items())
            dotInfo = np.loadtxt(dotInfopath[0][1])

            EucledianLoss = np.zeros((len(dotInfo), 1))
            Xpre = np.zeros((len(dotInfo), 1))
            Ypre = np.zeros((len(dotInfo), 1))
            Xtrue = np.zeros((len(dotInfo), 1))
            Ytrue = np.zeros((len(dotInfo), 1))

            for image in tmp_dict[sample]:
                if 'facegrid' != image and 'dotInfo' != image:
                    appleFace, appleLeftEye, appleRightEye = tmp_dict[sample][image].items()
                    IMAGE_FILE1 = appleLeftEye[1]
                    IMAGE_FILE2 = appleRightEye[1]
                    IMAGE_FILE3 = appleFace[1]

                    # mean subtraction for each input crop
                    image_left = Substract_mean(IMAGE_FILE1, image_mean_left)
                    image_right = Substract_mean(IMAGE_FILE2, image_mean_right)
                    image_face = Substract_mean(IMAGE_FILE3, image_mean_face)
                    facegrid = np.reshape(FaceGrid[int(image)], (1, 625, 1, 1))

                    # fill the four input blobs and run a forward pass
                    net.blobs["image_left"].data[...] = image_left
                    net.blobs["image_right"].data[...] = image_right
                    net.blobs["image_face"].data[...] = image_face
                    net.blobs["facegrid"].data[...] = facegrid
                    pred = net.forward()

                    Xpre[int(image)][0] = pred['fc3'][0][0]
                    Ypre[int(image)][0] = pred['fc3'][0][1]
                    Xtrue[int(image)][0] = dotInfo[int(image)][0]
                    Ytrue[int(image)][0] = dotInfo[int(image)][1]
                    EucledianLoss[int(image)][0] = Loss(Xtrue[int(image)][0], Xpre[int(image)][0], Ytrue[int(image)][0], Ypre[int(image)][0])
                    fout.write(sample + "," + image + "," + str(Xtrue[int(image)][0]) + "," + str(Xpre[int(image)][0]) + "," + str(Ytrue[int(image)][0]) + "," + str(Ypre[int(image)][0]) + "," + str(EucledianLoss[int(image)][0]) + "\n")

            Averageerror = Average(EucledianLoss, len(image) - 2)
    return EucledianLoss


# Accuracy: Euclidean distance between prediction and ground truth for one image
def Loss(X_t, X_p, Y_t, Y_p):
    return math.sqrt((X_t - X_p) ** 2 + (Y_t - Y_p) ** 2)


# Average error over all images
def Average(EucledianLoss, NumofImages):
    return np.sum(EucledianLoss) / NumofImages


# Mean subtraction: load an image as a float32 array
def load_image(filename):
    img = Image.open(filename)
    img.load()
    data = np.asarray(img, dtype="float32")
    # print("input", data.shape)
    return data


# subtract the mean image and reshape to the network input layout
def Substract_mean(filename, mean_image_array):
    img = load_image(filename)
    Substract_img = (img - mean_image_array) / 255
    Substract_img = np.reshape(Substract_img, (1, 3, 224, 224))
    # print("subtract", Substract_img.shape)
    return Substract_img
Hi, I do not really use Caffe anymore, so I cannot test your code. You can either try the PyTorch version or debug your solution by checking that all data are in the correct range (0-255 -> 0-1) when subtracting the mean, etc. Also, your error formula may be wrong: averaging squared distances and taking the square root at the end is not a linear mean of the errors. Try taking the square root of each value before accumulating. It should make a difference if the variance is large.
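To illustrate what I mean, here is a minimal sketch with made-up error values (not numbers from the dataset) showing that the mean of per-sample Euclidean distances differs from the root of the mean squared distance:

import numpy as np

# hypothetical per-axis errors (dx, dy) in cm for three samples, for illustration only
errors_xy = np.array([[0.5, 0.5],
                      [3.0, 4.0],
                      [0.1, 0.2]])

# square root taken per sample BEFORE averaging (mean Euclidean distance)
mean_euclidean = np.sqrt((errors_xy ** 2).sum(axis=1)).mean()

# square root taken AFTER averaging the squared distances (RMS error)
rms = np.sqrt((errors_xy ** 2).sum(axis=1).mean())

print(mean_euclidean, rms)  # ~1.98 vs ~2.92 -- not the same when the variance is large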
Hi Petr,
I have the same question: what is the error reported in Table 2 of the paper? For instance, if the true label is (x = 0, y = 0) and the prediction is (x = 1, y = 1), what is the error? Is it error = MSELoss(real, pred) = 1, or error = sqrt(2) = 1.414?
I thought the error should be 1.414, but the PyTorch code seems to count the error as 1.
I rewrote a function to calculate the evaluation metric (Euclidean distance on screen, see the sketch below) and then re-evaluated the provided checkpoint. The result is about 2.4 cm on-screen error, which is slightly worse than the reported result. I guess my understanding is right, but the PyTorch code only evaluates the L2 error.
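Roughly what I mean by that metric (a minimal sketch of such a function, not my exact evaluation code):

import torch

def on_screen_euclidean_error(pred, target):
    # pred, target: (batch, 2) gaze points in cm relative to the camera
    # root each per-sample distance first, then average over the batch
    return torch.sqrt(((pred - target) ** 2).sum(dim=1)).mean()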
Thanks!
Hi, check that the error you compute is square-rooted before being averaged.
Also, if the error vector is [1, 1], then the correct error is 1.41. Please note that there is a difference between the training loss (which is just L2 without the square root) and the accuracy reported in the paper, which is square-rooted (= Euclidean distance).
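A quick numeric sketch of that distinction for the [1, 1] example (assuming the usual definitions, not a quote from the training code):

import math

dx, dy = 1.0, 1.0  # per-axis error between prediction and ground truth

mse_style = (dx ** 2 + dy ** 2) / 2        # element-wise mean squared error: 1.0
euclidean = math.sqrt(dx ** 2 + dy ** 2)   # Euclidean distance, as reported in the paper: ~1.414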