Question About feature collapse
humanpose1 opened this issue · 3 comments
Hi Chris,
Thanks for your work!
I just have a little question.
In your article, you say that the hardest-triplet loss is prone to collapse. To mitigate the problem, you mix the hardest triplets with randomly sampled triplets.
Do you know why it is prone to collapse?
Sorry for the late response.
I think the triplet loss is, in general, a harder loss to train with, because it has many more local minima that lead to collapse.
On that note, the triplet loss has two terms inside a max that compete against each other. As in GAN training, these competing terms make it more prone to collapse: the gradient that minimizes the final loss can be interpreted as (1) minimizing the positive distance, (2) maximizing the negative distance, or simply (3) minimizing both, if the other options are too difficult.
The contrastive loss, on the other hand, has no option 3. Simply removing that third option keeps the network from collapsing.
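To make that argument concrete, here is a minimal sketch in plain Python. It is an illustration only, not the code from the repository: the function names, the margin of 1.0, and the particular contrastive form (squared positive distance plus a squared hinge on the negative distance) are assumptions chosen for the example. It shows that the triplet loss only sees the difference between the two distances, so shrinking both by the same amount is not penalized, whereas the contrastive loss has a separate negative term that grows as the negative distance shrinks.

```python
def triplet_loss(d_pos, d_neg, margin=1.0):
    # Both distances sit inside a single max, so only their difference matters.
    return max(0.0, margin + d_pos - d_neg)


def contrastive_terms(d_pos, d_neg, margin=1.0):
    # Assumed form: the positive and negative distances are penalized separately.
    pos_term = d_pos ** 2
    neg_term = max(0.0, margin - d_neg) ** 2
    return pos_term, neg_term


# Two hypothetical (positive distance, negative distance) pairs.
spread = (2.0, 2.5)
shrunk = (0.0, 0.5)  # both distances reduced by 2.0, i.e. "option 3"

triplet_spread = triplet_loss(*spread)
triplet_shrunk = triplet_loss(*shrunk)

neg_spread = contrastive_terms(*spread)[1]
neg_shrunk = contrastive_terms(*shrunk)[1]

# The triplet loss is unchanged by shrinking both distances together.
print(triplet_spread, triplet_shrunk)
# The contrastive negative term rises once the negative distance drops below the margin.
print(neg_spread, neg_shrunk)
```

In this toy setting, moving every feature closer together leaves the triplet loss where it was, while the contrastive negative term pushes back regardless of what happens to the positive distance.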
Thanks for your answer :)
Hi @humanpose1 and @chrischoy,
I was wondering about the same question, so it is nice to get that explanation.
But I still don't understand what you mean by "mixing random triplets with the hardest triplet loss".
It would be nice if one of you could elaborate on that.
Thanks :)