sthoduka/imreg_fmt

How to calculate a number for similarity

mostafafarzaneh opened this issue · 3 comments

Hey. Thanks for the effort on this. It saved me lots of time.

I need a number that tells me how similar the images are. For example, OpenCV's phaseCorrelate returns a double (response) that can be used as an indication of similarity.
I was thinking of using this library in a similar way to the phaseCorrelate response, but I cannot figure out the right way to do it.
From the OpenCV docs:

the response parameter is computed as the sum of the elements of r within the 5x5 centroid around the peak location. It is normalized to a maximum of 1 (meaning there is a single peak) and will be smaller when there are multiple peaks.

response: Signal power within the 5x5 centroid around the peak, between 0 and 1 (optional).

Hey @mostafafarzaneh
It does look like OpenCV computes the response as the sum over a defined neighbourhood divided by the size of the image.
See here and here. M*N doesn't seem to be the exact image size though, but rather the value computed using getOptimalDFTSize. They mention in one of the comments that the maximum value of the peak is roughly equal to M*N. I'm not sure why that is, but I assume there's a mathematical explanation for it, and it can be tested.
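
One way to test the M*N claim is a tiny experiment, sketched below under some assumptions: it uses a naive DFT instead of OpenCV or FFTW, and the names `dft2` and `phaseCorrelationSurface` are made up for the example. The idea is that every bin of the normalized cross power spectrum has magnitude 1, so an unscaled inverse DFT adds up M*N unit-magnitude terms, and at the true shift they all line up in phase.

```cpp
#include <cmath>
#include <complex>
#include <vector>

using cd = std::complex<double>;

// Naive 2D DFT, illustration only (slow for anything but tiny inputs).
// sign = -1: forward transform, sign = +1: inverse transform WITHOUT 1/(M*N) scaling.
std::vector<cd> dft2(const std::vector<cd>& in, int M, int N, int sign)
{
    const double pi = std::acos(-1.0);
    std::vector<cd> out(M * N);
    for (int u = 0; u < M; ++u)
        for (int v = 0; v < N; ++v) {
            cd acc(0.0, 0.0);
            for (int r = 0; r < M; ++r)
                for (int c = 0; c < N; ++c) {
                    double ang = sign * 2.0 * pi *
                                 (double(u * r) / M + double(v * c) / N);
                    acc += in[r * N + c] * std::polar(1.0, ang);
                }
            out[u * N + v] = acc;
        }
    return out;
}

// Real part of the unscaled phase correlation surface of two M x N images.
std::vector<double> phaseCorrelationSurface(const std::vector<double>& a,
                                            const std::vector<double>& b,
                                            int M, int N)
{
    std::vector<cd> ca(a.begin(), a.end()), cb(b.begin(), b.end());
    std::vector<cd> A = dft2(ca, M, N, -1), B = dft2(cb, M, N, -1);

    // Normalized cross power spectrum: each non-empty bin has magnitude 1.
    std::vector<cd> R(M * N);
    for (int i = 0; i < M * N; ++i) {
        cd cross = A[i] * std::conj(B[i]);
        double mag = std::abs(cross);
        R[i] = (mag > 1e-12) ? cross / mag : cd(0.0, 0.0);
    }

    std::vector<cd> r = dft2(R, M, N, +1);
    std::vector<double> out(M * N);
    for (int i = 0; i < M * N; ++i)
        out[i] = r[i].real();
    return out;
}
```

Correlating a 4x4 image that contains a single impulse with itself gives a surface whose entry at zero shift is 16, i.e. M*N. With real images, bins that are exactly zero contribute nothing and rounding adds noise, which would fit the "roughly equal" wording.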

Thanks.

I should mention that I use an image of size 512*512, so getOptimalDFTSize won't change the dimensions.
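
To spell out why 512 stays 512: as far as I understand the OpenCV documentation, getOptimalDFTSize returns the smallest size not less than the input that is a product of 2s, 3s and 5s, and 512 = 2^9 already qualifies. A minimal sketch of that rule follows (the names `isFiveSmooth` and `nextFiveSmooth` are made up, and OpenCV's real implementation may differ, for example by using a lookup table):

```cpp
#include <initializer_list>

// True if n has no prime factors other than 2, 3 and 5.
bool isFiveSmooth(int n)
{
    if (n < 1)
        return false;
    for (int p : {2, 3, 5})
        while (n % p == 0)
            n /= p;
    return n == 1;
}

// Smallest size >= n that is a product of 2s, 3s and 5s.
int nextFiveSmooth(int n)
{
    if (n < 1)
        n = 1;
    while (!isFiveSmooth(n))
        ++n;
    return n;
}
```

With this rule `nextFiveSmooth(512)` is 512, while a size such as 513 would be padded up to 540.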

At first, I divided the sum in getCenterOfMass by M*N (here: https://github.com/sthoduka/imreg_fmt/blob/master/src/image_dft.cpp#L165), but I got a number above 1.

Then I used OpenCV's weightedCentroid function and passed it exactly the same parameters as getCenterOfMass. It returned exactly the same results as getCentreOfMass.

But if I use OpenCV's phaseCorrelate method, the returned col and row are similar (not identical) to those from your phaseCorrelation method, and the response is under 1, which is correct.

I think there is a difference between OpenCV and FFTW in how they calculate the DFT. In OpenCV's phaseCorrelate, the minimum value in the matrix after the FFT steps (the C Mat) is a three-digit negative number, but the same matrix in your implementation (abs) has a tiny positive number as its minimum. The maximum value is the same though. But I could be wrong.

One difference seems to be that here we are taking the absolute value of the cross power spectrum. Unless I missed it, the OpenCV version doesn't seem to do that. That would explain the negative numbers from OpenCV and the tiny positive number here. I guess it shouldn't affect the location of the peak, but it would affect your similarity score.
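
To make the effect on the score concrete, here is a toy sketch with made-up numbers (the `WindowSums` struct and `sumWindow` function are hypothetical, not code from either library). It sums the same 5x5 window once with signed values and once with magnitudes:

```cpp
#include <cmath>
#include <vector>

// Signed and absolute sums over the same window of correlation values.
struct WindowSums {
    double signedSum;
    double absSum;
};

WindowSums sumWindow(const std::vector<double>& window)
{
    WindowSums s{0.0, 0.0};
    for (double v : window) {
        s.signedSum += v;        // negative neighbours cancel part of the peak
        s.absSum += std::fabs(v); // magnitudes can only add to the peak
    }
    return s;
}
```

For a 5x5 window with a peak of 25 in the middle and 24 neighbours of -0.5, the signed sum is 13 and the absolute sum is 37. Divided by M*N = 25, that gives 0.52 versus 1.48, so taking magnitudes before summing could be one reason for a response above 1.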