SthPhoenix/InsightFace-REST

Compare Faces

AmrRahmy opened this issue · 7 comments

Sorry if this is not the right place; I needed some clarification if possible. I don't know if there is a Discord, Slack, IRC, or WhatsApp group.

I was able to get the current GitHub version to work. By the way, the latest release zip file seems outdated: it fails to get the required dependencies, doesn't auto-download the models, and there was a git URL issue.

"/extract" works, I get back "vec" array, I am guessing this is the extracted features or embedding of the face, and can be used to compare faces. I thought there would be an endpoint for comparing two faces. If you don't mind can you guide me how i can compare two vecs? a small sample or guide me in the right direction to do a simple 1 to 1 comparison.

I had a question as well: if I use, say, glintr100 to extract the "vec" array and then change the model, would I need to re-extract the "vec" using the other model, or can I compare any two "vec" arrays regardless of the model used?

Hi!

  1. Yes, it's really outdated; glad you managed to build from master.

  2. I haven't added a separate endpoint for comparing faces, since it's trivial to compare faces once you have their embeddings, though I might add one later for demo purposes. You can compare vecs in Python like this (a fuller end-to-end sketch follows this list):

    import numpy as np

    # vec1 and vec2 are the "vec" arrays returned by the API
    similarity = (1. + np.dot(vec1, vec2)) / 2.

    The actual similarity is computed as np.dot(vec1, vec2), which is equivalent to cosine similarity for normalized vectors (the API returns normalized vectors). The remaining operations just map the resulting value from the [-1; 1] range to the [0; 1] range, which I think is more intuitive for the end user.

  3. No, you can only compare embeddings computed by the exact same model. In fact, even if you train two models with the same training set and hyperparameters, you'll get two different models, which are not interchangeable.
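
For completeness, here is a minimal end-to-end sketch of a 1-to-1 comparison. The request payload, response shape, and port below are assumptions for illustration (only the /extract path and the "vec" field come from this thread), so adjust them to the actual API schema:

import base64

import numpy as np
import requests


def get_embedding(path, url='http://localhost:18081/extract'):
    # NOTE: the payload and response shape here are assumptions based on
    # this thread, not the documented API schema; adjust to match what
    # your /extract actually accepts and returns.
    with open(path, 'rb') as f:
        img_b64 = base64.b64encode(f.read()).decode()
    resp = requests.post(url, json={'images': {'data': [img_b64]}})
    resp.raise_for_status()
    return np.asarray(resp.json()[0]['vec'])


vec1 = get_embedding('face1.jpg')
vec2 = get_embedding('face2.jpg')

# Embeddings come back normalized, so the dot product is the cosine
# similarity; remap it from [-1, 1] to [0, 1].
similarity = (1. + np.dot(vec1, vec2)) / 2.
print('similarity', similarity)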

Thank you for the quick response.

So I did a quick test. I'm not entirely sure the vec is a 1-D array (I don't have the Docker setup ready), but if it is, the plain dot product can exceed 1; from a quick online search, dot(vec1, vec2) / (norm(vec1) * norm(vec2)) should do it.

import numpy as np

from numpy import dot
from numpy.linalg import norm

# Deliberately unnormalized inputs: each has norm sqrt(2), not 1
vec1 = np.array([1, 0, 0, 1])
vec2 = np.array([1, 0, 0, 1])

# Plain dot product overshoots for unnormalized vectors
similarity = np.dot(vec1, vec2)
normalized = (1. + np.dot(vec1, vec2)) / 2.
# Cosine similarity divides by the norms, so it handles this case
cos_sim = dot(vec1, vec2) / (norm(vec1) * norm(vec2))

print('similarity', similarity)
print('normalized', normalized)
print('cos_sim', cos_sim)

Output:
similarity 2
normalized 1.5
cos_sim 0.9999999999999998

Hi! As I said, the dot product is equivalent to cosine similarity when the vecs are normalized, so in your test you should do it like this:

import numpy as np

from numpy import dot
from numpy.linalg import norm

vec1 = np.array([1, 0, 0, 1])
vec2 = np.array([1, 0, 0, 1])

# Normalize the inputs to unit length first
normed_vec1 = vec1 / norm(vec1)
normed_vec2 = vec2 / norm(vec2)

# Dot product of normalized vectors == cosine similarity
dot_prod = np.dot(normed_vec1, normed_vec2)
# Remap from [-1, 1] to [0, 1]
normalized = (1. + dot_prod) / 2.
cos_sim = dot(vec1, vec2) / (norm(vec1) * norm(vec2))

print('similarity', dot_prod)
print('normalized', normalized)
print('cos_sim   ', cos_sim)

The output is:

similarity 0.9999999999999998
normalized 0.9999999999999999
cos_sim    0.9999999999999998

Also take note that embeddings returned by API are already normalized.
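
A quick way to verify that, with response_vec standing in as a placeholder for the "vec" array you got back from /extract:

import numpy as np
from numpy.linalg import norm

# response_vec is a placeholder for the "vec" array from /extract
vec = np.asarray(response_vec)
print(norm(vec))  # should print a value very close to 1.0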

That makes sense. I didn't have the output of the API call ready, so I made up the input myself.
Thank you for the help, it's clear.

And some more info about (1. + dot_prod) / 2.

For input vecs like:

vec1 = np.array([1, 0, 0, 1])
vec2 = np.array([1, 0, 0, 0])

Result would be:

similarity 0.7071067811865475
normalized 0.8535533905932737

And for vecs

vec1 = np.array([1, 0, 0, 1])
vec2 = np.array([-1, 0, 0, 0])

Results are:

similarity -0.7071067811865475
normalized 0.14644660940672627

Similarities in the [0, 1] range seem more readable to me.
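
A small script that reproduces the numbers above (normalizing the inputs first, as in the earlier example):

import numpy as np
from numpy.linalg import norm

v1 = np.array([1, 0, 0, 1])
for v2 in (np.array([1, 0, 0, 0]), np.array([-1, 0, 0, 0])):
    nv1, nv2 = v1 / norm(v1), v2 / norm(v2)
    dot_prod = np.dot(nv1, nv2)        # cosine similarity
    normalized = (1. + dot_prod) / 2.  # remapped to [0, 1]
    print('similarity', dot_prod)
    print('normalized', normalized)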

Hello, I implemented the 1-to-1 face comparison mechanism with your repo as described in this issue. I compare the vectors in the output of the extract function as dot_prod, normalized, and cos_sim, but for all 3 similarity measures, my similarity score between 5 different people was higher than 95%. This is the opposite of the original insightface repo and the face_recognition repo. Is this normal?

Hi! The score for different people shouldn't be above 95% if you used the normalization to the [0, 1] range that I recommended above.
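
In practice you would compare the normalized score against a decision threshold. A minimal sketch; the 0.75 value is purely illustrative and should be tuned on your own data, it is not a recommendation from this repo:

import numpy as np

def is_same_person(vec1, vec2, threshold=0.75):
    # vec1 and vec2 must be normalized embeddings from the same model.
    # threshold is a hypothetical value; tune it on your own data.
    score = (1. + np.dot(vec1, vec2)) / 2.
    return score >= threshold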