baidu/puck

Why normalize the feature vector before searching in tinker?

xiaozxiong opened this issue · 2 comments

I found that there is a normalization operation before searching in tinker:

const float* feature = normalization(context.get(), request->feature);

What is the purpose of this operation? Also, with the default parameter whether_norm=true, I got a recall@100 of almost zero. After I changed it to whether_norm=false, the recall@100 was correct. Could you suggest some possible explanations?

Thank you!

The default distance metric is cosine similarity, which is why the query vector is normalized before searching. The returned distance is not the cosine similarity itself: it is computed as '2 - 2 * cosine similarity' between the two vectors.

For any other distance metric, set whether_norm to false.
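To illustrate the '2 - 2 * cosine similarity' transformation: for vectors scaled to unit length, that value equals the squared Euclidean distance between them, so normalizing first makes a Euclidean-style search rank results by cosine similarity. The sketch below checks this identity numerically. The helper names (`dot`, `l2_normalize`, `squared_l2`, `cosine_similarity`) are hypothetical and written only for this example; they are not Puck's API, and the sketch makes no claim about how Puck implements `normalization` internally.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration only; not part of Puck.

// Inner product of two equal-length vectors.
float dot(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

// Returns a copy of v scaled to unit L2 norm.
// A zero vector is returned unchanged to avoid dividing by zero.
std::vector<float> l2_normalize(const std::vector<float>& v) {
    const float norm = std::sqrt(dot(v, v));
    if (norm == 0.0f) {
        return v;
    }
    std::vector<float> out(v);
    for (float& x : out) {
        x /= norm;
    }
    return out;
}

// Squared Euclidean distance between two equal-length vectors.
float squared_l2(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Cosine similarity of two non-zero vectors.
float cosine_similarity(const std::vector<float>& a, const std::vector<float>& b) {
    return dot(a, b) / (std::sqrt(dot(a, a)) * std::sqrt(dot(b, b)));
}
```

For example, with a = (3, 4) and b = (4, 3), the cosine similarity is 24/25 = 0.96, and the squared distance between the normalized vectors is 0.08 = 2 - 2 * 0.96. The squared distance between the raw vectors is 2, a different value, so a result list ranked by cosine on normalized vectors need not match ground truth computed with another metric on unnormalized data. That mismatch is one possible explanation for the near-zero recall reported above, assuming the ground truth was not computed with cosine similarity.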

Thank you for your reply.