Jensen–Shannon divergence
LamyaMohaned opened this issue · 2 comments
Hello,
I'm trying to understand the Jensen–Shannon divergence. I still don't fully understand the math behind it, but someone asked me to look into it and AugMix because of this paragraph:
"Alternatively, we can view each set as an empirical distribution and measure the distance between them using Kullback-Leibler (KL) or Jensen-Shannon (JS) divergence. The challenge for learning with KL or JS divergence is that no useful gradient is provided when the two empirical distributions have disjoint supports or have a non-empty intersection contained in a set of measure zero."
from here: https://arxiv.org/pdf/1907.10764.pdf
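For reference (my own summary, not from the paper), the JS divergence between two distributions P and Q is built from the KL divergence via their mixture M:

```latex
\mathrm{KL}(P \,\|\, Q) = \sum_x P(x) \log \frac{P(x)}{Q(x)}, \qquad
M = \tfrac{1}{2}(P + Q), \qquad
\mathrm{JS}(P \,\|\, Q) = \tfrac{1}{2}\,\mathrm{KL}(P \,\|\, M) + \tfrac{1}{2}\,\mathrm{KL}(Q \,\|\, M).
```

The problem the paper describes is that when P and Q put probability mass on completely different points, the KL terms either blow up or saturate, so the divergence gives no useful gradient signal.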
Is this problem present in AugMix?
This is not a problem for AugMix: the distributions it compares are softmax outputs over the same set of classes (the predictions for the clean and augmented versions of an image), so they share the same support, and every element of the support has probability greater than zero.
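Here is a minimal numeric sketch of that point, assuming the compared distributions are softmax-style probability vectors over the same classes (the helper functions `kl` and `js` below are just illustrative, not AugMix code):

```python
import numpy as np

def kl(p, q):
    # KL(p || q); assumes p and q are probability vectors over the same support
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

def js(p, q):
    # JS(p || q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m the mixture
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Softmax-like outputs over the same classes: every entry is strictly positive,
# so both KL terms are finite and the JS divergence varies smoothly.
p = np.array([0.7, 0.2, 0.1])
q = np.array([0.5, 0.3, 0.2])
print(js(p, q))  # small finite value (~0.022)

# Disjoint supports, the problematic case described in the paper:
# KL(p || q) would be infinite, and JS saturates at log(2),
# so it provides no useful gradient.
p_disjoint = np.array([1.0, 0.0, 0.0])
q_disjoint = np.array([0.0, 0.0, 1.0])
print(js(p_disjoint, q_disjoint))  # log(2) ≈ 0.693
```

Because softmax never outputs exact zeros, the AugMix consistency term stays in the first regime rather than the saturated one.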
Thank you!