kilianFatras/JUMBOT

Quick question

Closed this issue · 2 comments

Hi. Thank you very much for sharing your code. I have one clarification question: I was confused about whether we also need to backpropagate through the Sinkhorn algorithm. The repository code doesn't do that. Is this due to the envelope theorem used in papers [1] & [2]?

Hi,
Good question! The answer is: we don't need to! Why? Thanks to Danskin's theorem (https://en.wikipedia.org/wiki/Danskin%27s_theorem).

The entropic-regularized UOT problem has a unique solution, so the value of the problem has a well-defined gradient with respect to the ground cost C. Furthermore, we have an unbiased estimator of the expectation of minibatch OT, so we can exchange gradients and expectations. Thus minimizing our empirical estimator leads to the minimum of the full problem; this is justified by our second theorem.
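As a minimal sketch of what Danskin's theorem buys here (a toy NumPy illustration, not the repository code; the problem size, marginals and regularization strength `eps` are made up): once the entropic plan P* has been computed, the gradient of the regularized OT value with respect to the cost C is simply P* itself, which a finite-difference check confirms, so no backpropagation through the Sinkhorn loop is needed.

```python
import numpy as np

def sinkhorn_plan(a, b, C, eps=0.5, n_iter=1000):
    """Entropic OT plan via Sinkhorn iterations (balanced case)."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

def entropic_value(a, b, C, eps=0.5):
    """Regularized objective <P*, C> + eps * sum P* (log P* - 1)."""
    P = sinkhorn_plan(a, b, C, eps)
    return np.sum(P * C) + eps * np.sum(P * (np.log(P) - 1.0))

rng = np.random.default_rng(0)
a = np.full(3, 1 / 3)      # uniform source marginal
b = np.full(4, 1 / 4)      # uniform target marginal
C = rng.random((3, 4))     # random ground cost

P = sinkhorn_plan(a, b, C)

# Danskin: d(value)/dC_ij = P*_ij; check one entry by central differences.
i, j, h = 1, 2, 1e-5
E = np.zeros_like(C)
E[i, j] = h
fd = (entropic_value(a, b, C + E) - entropic_value(a, b, C - E)) / (2 * h)
print(abs(fd - P[i, j]))  # ≈ 0: the plan itself is the gradient
```

In a deep-learning setting this is exactly why one can compute the plan inside a no-gradient block and then backpropagate only through the cost matrix.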

Note that your suggestion would also work: it has been used to obtain a differentiable loss with a fixed budget of Sinkhorn iterations. But with a small budget you do not get the optimal solution, which is why I prefer relying on Danskin's theorem. This is also what the original DeepJDOT implementation did: https://github.com/bbdamodaran/deepJDOT
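To illustrate the fixed-budget caveat (again a hypothetical NumPy sketch with made-up sizes, not code from either repository): with only a handful of Sinkhorn iterations the returned plan does not yet satisfy the marginal constraints, so it is not the optimal solution that Danskin's theorem requires; unrolling and backpropagating through those iterations is then the way to get a correct gradient of the truncated loss.

```python
import numpy as np

def sinkhorn_plan(a, b, C, eps=0.3, n_iter=300):
    """Entropic OT plan via Sinkhorn iterations (balanced case)."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
a = np.full(5, 1 / 5)
b = np.full(5, 1 / 5)
C = rng.random((5, 5))

# Column-marginal violation after a small vs. a large iteration budget.
def marginal_err(n_iter):
    P = sinkhorn_plan(a, b, C, n_iter=n_iter)
    return np.abs(P.sum(axis=0) - b).max()

print(marginal_err(3), marginal_err(300))
# A budget of 3 leaves a visible constraint violation; 300 drives it to
# numerical precision, at which point the plan-is-the-gradient shortcut applies.
```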

Thanks a lot for the detailed & quick response.