General Loss function discussion
KayARS opened this issue · 8 comments
I had some general questions about the `nll_with_covariances` function you use for your loss.
It seems a lot of other people have similar questions on your repo right now.
Could you share with us why you've chosen this function? Did you get it from somewhere, or did you derive it yourself?
What's the general idea behind it? I'm seeing a lot of log and exponent calls, but I figure there is some mathematical formula behind this that I just can't find anywhere.
The range of the loss seems to be (-inf, inf), which seems a little weird to me, because it's hard to tell when the model is actually converging.
(I guess negative is good, but I can't say for sure.)
Any help would be greatly appreciated!
I think the authors of MultiPath++ use the loss defined in MultiPath, and the author of this repo may use a similar idea in the code.
However, the `torch.logdet` operation may produce `nan` or `inf` values and raise errors in the subsequent code.
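For anyone who wants to reproduce this, here is a minimal sketch (my own, not from the repo) of how `torch.logdet` misbehaves on singular or indefinite matrices:

```python
import torch

singular = torch.zeros(2, 2)             # det == 0
print(torch.logdet(singular))            # -inf

indefinite = torch.tensor([[1.0, 2.0],
                           [2.0, 1.0]])  # det == -3 < 0
print(torch.logdet(indefinite))          # nan
```

Once a `-inf` or `nan` enters the loss, it typically propagates through the backward pass and poisons the gradients.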
Yes, I've seen that, too.
Did you find a good solution for the case where the determinant of the covariance matrix is <= 0?
I'm currently just setting the determinant to a small positive value (so that the log doesn't return -inf or nan),
but I'm not sure whether that is an optimal approach.
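To make that concrete, here is a sketch of what I mean by the workaround (the helper name is mine, assuming covariance tensors of shape `[..., 2, 2]`):

```python
import torch

def safe_log_det(covariance_matrices, eps=1e-6):
    # Clamp the determinant to a small positive floor before taking the
    # log, so singular/indefinite matrices don't produce -inf or nan.
    det = torch.det(covariance_matrices)
    return torch.log(torch.clamp(det, min=eps))
```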
I haven't found a reasonable solution. @stepankonev, could you please help us?
I've been recommended to do the following (which is generally helpful when using the log):

```python
torch.log(torch.det(covariance_matrices).unsqueeze(-1) + 1e-6)
```

instead of:

```python
torch.logdet(covariance_matrices).unsqueeze(-1)
```

It might not be the most optimal, but it works for me and might help you, too.
It seems to be a general instability problem.
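Another standard numerical recipe (not from this repo; it assumes the matrices are at least close to positive definite) is to add a small jitter to the diagonal and take the log-determinant via a Cholesky factorization:

```python
import torch

def stable_logdet(cov, jitter=1e-6):
    # With Sigma = L @ L^T, log|Sigma| = 2 * sum(log(diag(L))).
    # The jitter pushes eigenvalues away from zero so the
    # factorization stays finite.
    eye = torch.eye(cov.shape[-1], device=cov.device, dtype=cov.dtype)
    chol = torch.linalg.cholesky(cov + jitter * eye)
    return 2.0 * torch.log(torch.diagonal(chol, dim1=-2, dim2=-1)).sum(-1)
```

Unlike the `det + eps` variant, this never computes a raw determinant, which also avoids underflow for larger matrices.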
Hi everyone! You can read about the loss function in the technical report; actually, it is nothing but the NLL that has been used in many papers.
I want to know why there is a `log_softmax(confidences)` instead of a classification loss.
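Not the author, but here is my reading, sketched under the assumption that the loss follows the MultiPath mixture formulation (tensor names are illustrative, not the repo's exact code). The `log_softmax` over the confidences enters a `logsumexp`, so it acts as the soft classification term of the mixture likelihood rather than a separate loss:

```python
import torch

def mixture_nll(confidences, log_probs_per_mode):
    # confidences:        [batch, modes] raw logits
    # log_probs_per_mode: [batch, modes] log N(y | mu_k, Sigma_k),
    #                     summed over timesteps
    # NLL = -log sum_k softmax(confidences)_k * N(y | mu_k, Sigma_k)
    log_mix = torch.log_softmax(confidences, dim=-1) + log_probs_per_mode
    return -torch.logsumexp(log_mix, dim=-1).mean()
```

This also bears on the earlier question about the loss range: a Gaussian density can exceed 1, so the log-likelihood can be positive and the NLL can go negative; the loss is unbounded in both directions.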
I find that the covariance matrix here is positive semi-definite, so `torch.det(covariance_matrix) >= 0`; the epsilon only has to guard against a determinant of exactly zero.
Could you help me understand what this line means?

```python
errors = coordinates_delta.permute(0, 1, 2, 4, 3) @ precision_matrices @ coordinates_delta
```
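Not the author, but my reading of that line (a sketch under assumed shapes; the exact dimensions in the repo may differ): it computes the Mahalanobis quadratic form (y - mu)^T Sigma^-1 (y - mu) for every mode and timestep. The permute turns the column vector `coordinates_delta` of shape `[..., 2, 1]` into a row vector `[..., 1, 2]`, and the two batched matmuls leave a `[..., 1, 1]` scalar per step:

```python
import torch

batch, modes, time = 4, 6, 80
coordinates_delta = torch.randn(batch, modes, time, 2, 1)           # y - mu
precision_matrices = torch.eye(2).expand(batch, modes, time, 2, 2)  # Sigma^-1

# (y - mu)^T @ Sigma^-1 @ (y - mu), batched over [batch, modes, time]
errors = coordinates_delta.permute(0, 1, 2, 4, 3) @ precision_matrices @ coordinates_delta
print(errors.shape)  # torch.Size([4, 6, 80, 1, 1])
```

This is the quadratic term of the Gaussian log-density, with the precision matrix being the inverse of the covariance.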
I found these slides; maybe they're helpful.