Wrong cosine similarity results on face recognition

Question

cucabr opened this issue 5 years ago · 10 comments

First of all, thank you for the code; it is very well written.

I used your code for face recognition, along with the model template you provided (thank you), and in my tests everything went as planned.
But when I used it with my IP camera (640x480 resolution), the results for my face were confusing. What could I be doing wrong? Could the results be wrong because the image I want to compare is in grayscale?

The face I'd like to compare (face 1): Me

The face of someone else (face 2): Other person

My face (which I'd like to match w/ the face 1): Me in ip camera

Other person "cosine similarity": 0.4480170012
Me "cosine similarity": 0.6674099863

I would appreciate a reply even if you cannot help me. I do not know what else to try.

Thank youuu!!

serengil commented 5 years ago

Resolved

Answer 1 · 2019-05-09T14:42:16.000Z

Are you using this project? Also, please crop the face you would like to compare. The image should contain only the face, but yours does not.

Answer 2 · 2019-05-09T15:02:09.000Z

Yes, I am using exactly this code; I'm using parts of it.
I am also cropping the faces (I use the haarcascade_frontalface_alt2.xml cascade). My code adds a margin because I found that the results are much more accurate when the crop extends to just above the shoulders. I crop the face from both the captured image and the loaded image.

The face-cropping code:

def crop_face(self, imgarray, section, margin=40, size=224):
        """
        :param imgarray: full image
        :param section: detected face area (x, y, w, h)
        :param margin: margin added around the detected area, as a percentage of min(w, h), to include the full head
        :param size: the result image resolution will be (size x size)
        :return: tuple of (resized image as a numpy array with shape (size x size x 3), cropped area (x, y, w, h))
        """
        img_h, img_w, _ = imgarray.shape
        if section is None:
            section = [0, 0, img_w, img_h]
        (x, y, w, h) = section
        margin = int(min(w,h) * margin / 100)
        x_a = x - margin
        y_a = y - margin
        x_b = x + w + margin
        y_b = y + h + margin
        if x_a < 0:
            x_b = min(x_b - x_a, img_w-1)
            x_a = 0
        if y_a < 0:
            y_b = min(y_b - y_a, img_h-1)
            y_a = 0
        if x_b > img_w:
            x_a = max(x_a - (x_b - img_w), 0)
            x_b = img_w
        if y_b > img_h:
            y_a = max(y_a - (y_b - img_h), 0)
            y_b = img_h
        cropped = imgarray[y_a: y_b, x_a: x_b]
        resized_img = cv2.resize(cropped, (size, size), interpolation=cv2.INTER_AREA)
        resized_img = np.array(resized_img)
        return resized_img, (x_a, y_a, x_b - x_a, y_b - y_a)

If I use this image (1080x720 total) for comparison, everything is OK and the result is this:

Other person "cosine similarity": 0.6618626714
Me "cosine similarity": 0.1134036183

Thank you so much for answering me!
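
To make the margin handling in crop_face easier to follow, here is a small sketch that repeats only its box arithmetic, without touching any pixels. The helper name `expand_box` and the example detections are hypothetical; the logic is copied from the function above.

```python
def expand_box(section, img_w, img_h, margin=40):
    """Mirror the box arithmetic of crop_face; returns (x, y, w, h)."""
    x, y, w, h = section
    pad = int(min(w, h) * margin / 100)  # margin is a percentage of the smaller side
    x_a, y_a = x - pad, y - pad
    x_b, y_b = x + w + pad, y + h + pad
    # If the box leaves the frame, shift it back inside instead of shrinking it.
    if x_a < 0:
        x_b = min(x_b - x_a, img_w - 1)
        x_a = 0
    if y_a < 0:
        y_b = min(y_b - y_a, img_h - 1)
        y_a = 0
    if x_b > img_w:
        x_a = max(x_a - (x_b - img_w), 0)
        x_b = img_w
    if y_b > img_h:
        y_a = max(y_a - (y_b - img_h), 0)
        y_b = img_h
    return x_a, y_a, x_b - x_a, y_b - y_a


# Hypothetical detections on a 640x480 frame.
print(expand_box((270, 190, 100, 100), 640, 480))  # (230, 150, 180, 180)
print(expand_box((10, 20, 100, 120), 640, 480))    # (0, 0, 180, 200)
```

Note that a non-square detection (100x120 in the second example) yields a non-square crop (180x200), so the later resize to 224x224 slightly changes the aspect ratio of the face.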

Answer 3 · 2019-05-09T18:04:00.000Z

This is less than the threshold value (0.30). So, the issue is resolved, right?

I will add a face-cropping step when reading the dictionary images.
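
For reference, a minimal sketch of the decision rule discussed here. It assumes the reported scores are cosine distances (1 minus the cosine of the angle between two embeddings), so that a score below the 0.30 threshold counts as a match. The names `cosine_distance` and `is_match` are illustrative and not taken from the project.

```python
import math

THRESHOLD = 0.30  # threshold value quoted in this thread


def cosine_distance(a, b):
    """Return 1 - cos(angle between a and b); 0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def is_match(distance, threshold=THRESHOLD):
    """Assumed rule: a smaller distance means more similar faces."""
    return distance < threshold


# Scores reported in this thread, keyed by (probe image, reference face).
reported = {
    ("ip camera", "other person"): 0.4480170012,
    ("ip camera", "me"): 0.6674099863,
    ("1080x720 image", "other person"): 0.6618626714,
    ("1080x720 image", "me"): 0.1134036183,
}

for (probe, reference), score in reported.items():
    print(f"{probe} vs {reference}: {score:.4f} match={is_match(score)}")
```

Under this reading, only the 1080x720 comparison against "Me" (0.1134) falls below the threshold; both IP camera scores stay above it, so the IP camera image matches neither reference face.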

Answer 4 · 2019-05-09T18:18:40.000Z

No, it's not resolved. Sorry if I was not clear.

I have 4 images in my tests.

Face1: The face I'd like to compare (taken with a phone camera): Me
Face2: The face of someone else (taken with the IP camera): Other person
Face3: My face on the IP camera (which I'd like to match with Face1): Me in ip camera
Face4: My face (taken with a webcam): Me

I preloaded Face1 and Face2.

The problem is:

When I compare Face3 with the preloaded Face1 and Face2 images, the results are wrong:

Other person "cosine similarity": 0.4480170012
Me "cosine similarity": 0.6674099863

But when I compare Face4 with the preloaded Face1 and Face2 images, the results are OK:

Other person "cosine similarity": 0.6618626714
Me "cosine similarity": 0.1134036183

I wonder what I might be doing wrong. I assume the error happens because Face3 is in grayscale. I have tried everything, but I cannot get a satisfactory result in this case, and I need face recognition to work with an IP camera.

Thank you very much for your attention!

Answer 5 · 2019-05-13T18:06:20.000Z

Would you please try loading Face1 and Face2 in grayscale? Also, please confirm that Face1, Face2, Face3 and Face4 are already cropped.

Answer 6 · 2019-05-13T18:20:13.000Z

Yes, all the images are being cropped and I'm loading them in grayscale.
I suspect that the problem may be caused by some noise in the image.

I wrote a function that uses OpenCV features to try to clean the images. The results have improved but the cosine similarity is still greater than 0.20 and I still have false positives.
I do not know what else to do.

def filter_image(self,face_img):
        #convert to gray and return to 3 channels
        gray = cv2.cvtColor(face_img,cv2.COLOR_RGB2GRAY)
        face_img = cv2.cvtColor(gray,cv2.COLOR_GRAY2RGB)
        
        #sharpen
        kernel = np.array([[-1,-1,-1], [-1,9,-1], [-1,-1,-1]])
        face_img = cv2.filter2D(face_img, -1, kernel)
        
        #equalizeHist (each channel)
        rgb_planes = cv2.split(face_img)
        result_planes = []
        for plane in rgb_planes:
            plane_hist = cv2.equalizeHist(plane)
            result_planes.append(plane_hist)
        face_img = cv2.merge(result_planes)
        
        #Denoising
        face_img = cv2.fastNlMeansDenoisingColored(face_img,None,10,10,7,21)
        
        return face_img

Maybe I do not know how to work with Keras and Tensorflow.

Answer 7 · 2019-05-13T19:12:52.000Z

If possible, please reopen this issue. Even though there is nothing wrong with your code, other people may be able to help me, and the code could also be improved for use with IP cameras (with noise).

Answer 8 · 2019-05-13T19:16:41.000Z

It seems that IP cameras are problematic, so the issue is reopened.

Answer 9 · 2019-11-03T21:18:32.000Z

I solved the problem by using higher-resolution camera images. If anyone can get predictions to work on images from low-resolution surveillance cameras (640x480), please post the code here.
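
One way to act on this observation is to flag crops that are too small before comparing them. The sketch below is only a heuristic, assuming the 224x224 input size used as the default in crop_face earlier in this thread. The function name `face_is_large_enough` and the idea of rejecting crops that would need upscaling are illustrative assumptions, not part of the project.

```python
MODEL_INPUT_SIZE = 224  # default `size` in crop_face earlier in this thread


def face_is_large_enough(box, min_side=MODEL_INPUT_SIZE):
    """Return True if the crop box (x, y, w, h) needs no upscaling."""
    _, _, w, h = box
    return min(w, h) >= min_side


# Hypothetical crop boxes: a distant face on a 640x480 frame and a
# closer face on a higher-resolution frame.
print(face_is_large_enough((300, 200, 90, 90)))    # False
print(face_is_large_enough((500, 300, 260, 280)))  # True
```

Crops that fail this check would have to be enlarged to reach the model input size, which adds no real detail; whether that explains the scores seen here is not confirmed by the thread.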