facebookresearch/meru

Making entailment loss learnable?

ez2rok commented

Hello! Great work on this paper!

I was wondering whether you considered making the entailment loss learnable, similar to the curvature or the visual / textual alphas. What went into your decision to choose the entailment loss manually?

Cheers!

ez2rok commented

To clarify, I was specifically referring to the entail_weight, or λ parameter, of the MERU model. However, I see that the authors did experiment with different values of λ. To quote from the paper:

Some λ > 0 is necessary to induce partial order structure, however, quantitative performance is less sensitive to the choice of λ ∈ [0.01, 0.3]; Higher values of λ strongly regularize against the contrastive loss and hurt performance.

It seems that the authors did not make λ (the entail_weight) learnable because λ > 0.3 generally hurt performance, while for λ <= 0.3 the effect was qualitative rather than quantitative, so there would be little quantitative signal for the model to learn λ from.
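
One more generic consideration, sketched below in plain Python: if the total loss has the form contrastive + λ * entailment and the entailment term is non-negative (as a hinge-style loss would be), then the gradient of the total loss with respect to λ is just the entailment term itself, so plain gradient descent on a free λ can only push it toward zero. This is only a toy illustration under those assumptions, not the authors' stated reasoning; `descend_weight` is a hypothetical helper and the loss values are made up.

```python
def descend_weight(entail_loss_values, init_weight=0.2, lr=0.1):
    """Run gradient descent on a free loss weight, clamped at zero.

    Assumes total = contrastive + weight * entail_loss with entail_loss >= 0,
    so d(total)/d(weight) = entail_loss for every step.
    """
    weight = init_weight
    history = [weight]
    for entail_loss in entail_loss_values:
        grad = entail_loss  # gradient of the total loss w.r.t. the weight
        weight = max(0.0, weight - lr * grad)  # descent step, kept non-negative
        history.append(weight)
    return history


# Made-up, non-negative entailment loss values for five training steps.
history = descend_weight([0.5, 0.4, 0.6, 0.5, 0.5])
print(history)
```

In this toy setup the weight never increases, which suggests a learnable λ would need some extra mechanism (a regularizer or an uncertainty-style weighting, for example) to avoid collapsing.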