anibali/dsntnn

Get confidence of prediction per regressed coordinate

simonhessner opened this issue · 8 comments

Hi,

first of all: I really like the DSNT layer. It works perfectly and the idea is really cool :)

However, what would be really useful for my application is a way to get the model's confidence that the coordinate regressed by DSNT is correct. Looking at the heatmaps, the confidence should be very high when the heatmap values are close to 1 at the predicted position and close to 0 everywhere else. It should be low when the heatmap has a large patch of low values with just one point in the middle that is slightly higher.

So I guess what I am looking for is a way to derive the standard deviation of a Gaussian centred at the position predicted by DSNT, and then to transform that deviation into a confidence value between 0 and 1.

In the end I want to have n coordinates, each with 3 values: x, y, confidence.

Is there an easy way to do this with the functions already provided by DSNT?

Best,
Simon

I think I could use _divergence_reg_losses with different sigma_t values and mu_t set to the regressed coordinate, and then see which sigma_t gives the lowest loss. That would be kind of a brute-force approach, but better than nothing.
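For example, something like this sketch (assuming dsntnn.js_reg_losses(heatmaps, mu_t, sigma_t) is the public JS-divergence version of that regularizer and that it returns per-location losses):

import torch
import dsntnn

def estimate_sigma(heatmaps, coords, candidate_sigmas=(0.5, 1.0, 2.0, 4.0)):
    # Evaluate the divergence against a target Gaussian centred on the DSNT
    # coordinates for every candidate sigma_t.
    losses = torch.stack([
        dsntnn.js_reg_losses(heatmaps, coords, sigma_t)  # (batch, locations)
        for sigma_t in candidate_sigmas
    ])                                                   # (num_sigmas, batch, locations)
    # Pick, for each location, the sigma_t whose target Gaussian fits best.
    best = losses.argmin(dim=0)
    sigmas = heatmaps.new_tensor(candidate_sigmas)
    return sigmas[best]                                  # (batch, locations)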

If you look at the variance_reg_losses function you can find an example of calculating the variance of heatmaps:

dsntnn/dsntnn/__init__.py

Lines 233 to 262 in 4f20f5a

def variance_reg_losses(heatmaps, sigma_t):
    """Calculate the loss between heatmap variances and target variance.

    Note that this is slightly different from the version used in the
    DSNT paper. This version uses pixel units for variance, which
    produces losses that are larger by a constant factor.

    Args:
        heatmaps (torch.Tensor): Heatmaps generated by the model
        sigma_t (float): Target standard deviation (in pixels)

    Returns:
        Per-location sum of square errors for variance.
    """

    # mu = E[X]
    values = [normalized_linspace(d, dtype=heatmaps.dtype, device=heatmaps.device)
              for d in heatmaps.size()[2:]]
    mu = linear_expectation(heatmaps, values)

    # var = E[(X - mu)^2]
    values = [(a - b.squeeze(0)) ** 2 for a, b in zip(values, mu.split(1, -1))]
    var = linear_expectation(heatmaps, values)
    heatmap_size = torch.tensor(list(heatmaps.size()[2:]), dtype=var.dtype, device=var.device)
    actual_variance = var * (heatmap_size / 2) ** 2

    target_variance = sigma_t ** 2
    sq_error = (actual_variance - target_variance) ** 2

    return sq_error.sum(-1, keepdim=False)

I haven't tried using it this way myself, but it could be possible to use this sort of calculation as a proxy for confidence.
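For example, the same calculation could be pulled out into a small helper that returns the raw per-location variance instead of the squared error against a target (untested sketch, reusing normalized_linspace and linear_expectation from the snippet above):

import torch
from dsntnn import normalized_linspace, linear_expectation

def heatmap_variance(heatmaps):
    # mu = E[X], computed over normalised heatmaps.
    values = [normalized_linspace(d, dtype=heatmaps.dtype, device=heatmaps.device)
              for d in heatmaps.size()[2:]]
    mu = linear_expectation(heatmaps, values)
    # var = E[(X - mu)^2], converted to pixel units as in variance_reg_losses.
    values = [(a - b.squeeze(0)) ** 2 for a, b in zip(values, mu.split(1, -1))]
    var = linear_expectation(heatmaps, values)
    heatmap_size = torch.tensor(list(heatmaps.size()[2:]), dtype=var.dtype, device=var.device)
    return var * (heatmap_size / 2) ** 2  # (batch, locations, dims)

A small value means the heatmap mass is tightly concentrated around the predicted point, which you could interpret as high confidence.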

Very nice, I had not seen this function. It seems to work pretty well in my first tests. I just have to find a nice way to transform it into values between 0 and 1, but I'll figure it out. Thank you!
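For example, I could try something simple like exponential decay, with a temperature that I would have to tune on my data:

import torch

def to_confidence(score, tau=1.0):
    # score == 0 -> confidence 1.0; large score -> confidence close to 0.
    # tau is a free parameter controlling how quickly confidence falls off.
    return torch.exp(-score / tau)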

@anibali, I use the function variance_reg_losses() to get confidence. But the value is 0.05 (very low) with unnormalized_heatmaps and 3.9954e+08 (very high) with normalized_heatmaps. How can I get a normal landmark confidence? If I make a traditional Gaussian heatmap like Figure 2b in the paper, the heatmap values are in [0, 1]; I will try that.

You will need to calibrate the outputs from variance_reg_losses yourself. Keep in mind that it should have an inverse relationship with confidence (a large value indicates low confidence). It should be used with normalised heatmaps.

Note that this is not something that I have tried myself, so you will need to do some experimenting.
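As a rough, untested sketch of what I mean (model is a placeholder for your own network, the flat_softmax/dsnt calls follow basic_usage.md, and to_confidence can be any decreasing mapping into [0, 1], such as the one sketched earlier in this thread):

import dsntnn

unnormalized_heatmaps = model(images)                       # (batch, locations, H, W)
heatmaps = dsntnn.flat_softmax(unnormalized_heatmaps)       # each map now sums to 1
coords = dsntnn.dsnt(heatmaps)                              # coordinates in [-1, 1]
scores = dsntnn.variance_reg_losses(heatmaps, sigma_t=1.0)  # larger = less confident
confidence = to_confidence(scores)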

@anibali Thanks for your reply. I have calibrated the outputs and experimented. In the example from basic_usage.md, I added three different points: [24, 27], [28, 32], [32, 15]. But the output values for the three points differ a lot, e.g. 0.2, 0.02, 0.12. Is that normal? I calibrate with abs(output - output_mean) / (max_output - min_output); is that right? If there is only one point as input, this calibration method does not work.

Could I instead render a Gaussian with a function like gaussian(img, pt, sigma)? I will give it a try.

You just have to experiment with your actual data. There is no way to tell whether the spread of numbers is meaningful for only three points; you have to observe what the values are like for both good and bad predictions to see whether they are correlated.
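For example, something along these lines on a validation set (untested; val_heatmaps and target_coords are placeholders for your own model outputs and ground-truth normalised coordinates, and euclidean_losses is used as in basic_usage.md):

import dsntnn

heatmaps = dsntnn.flat_softmax(val_heatmaps)
coords = dsntnn.dsnt(heatmaps)
scores = dsntnn.variance_reg_losses(heatmaps, sigma_t=1.0).flatten()
errors = dsntnn.euclidean_losses(coords, target_coords).flatten()

# Pearson correlation between the variance score and the localisation error.
# A clearly positive value means the score tracks the error and can be
# inverted into a useful confidence.
scores_c = scores - scores.mean()
errors_c = errors - errors.mean()
correlation = (scores_c * errors_c).sum() / (scores_c.norm() * errors_c.norm())
print(float(correlation))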

OK, thanks, I will give it a try.