Making entailment loss learnable?

Question

Opened this issue a year ago · 1 comment

Hello! Great work on this paper!

I was wondering whether you considered making the entailment loss learnable, similar to the curvature or the visual / textual alphas. What went into your decision to set it manually?

Cheers!

Answer 1 · 2023-09-13T19:09:00.000Z

To clarify, I was specifically referring to the entail_weight, i.e. the λ parameter of the MERU model. However, I see that the authors did experiment with different values of λ. To quote from the paper:

Some λ > 0 is necessary to induce partial order structure, however, quantitative performance is less sensitive to the choice of λ ∈ [0.01, 0.3]; Higher values of λ strongly regularize against the contrastive loss and hurt performance.

It seems that the authors did not make λ (the entail_weight) learnable because λ > 0.3 generally hurt performance, while for λ <= 0.3 the effect was qualitative rather than quantitative, so there would be little training signal from which to learn it.
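
As a rough illustration of why a directly learned weight is awkward (my own sketch, not something the authors state): if the total loss has the form contrastive + λ * entailment and the entailment term is non-negative, then the gradient with respect to λ is just the entailment loss itself, so plain gradient descent can only push λ down. The function names, the sample loss values, the learning rate, and the clamp at zero below are all hypothetical and are not taken from the MERU code.

```python
def grad_wrt_lambda(entailment_loss):
    # d/dλ (contrastive + λ * entailment) = entailment
    return entailment_loss


def naive_lambda_descent(entailment_losses, lam=0.2, lr=0.1):
    """Run gradient descent on λ alone, clamping λ at zero.

    `entailment_losses` is a made-up sequence of non-negative per-step
    entailment loss values (an assumption for illustration only).
    """
    history = [lam]
    for loss in entailment_losses:
        lam = max(0.0, lam - lr * grad_wrt_lambda(loss))
        history.append(lam)
    return history


# Hypothetical loss values; λ shrinks at every step until it hits zero.
history = naive_lambda_descent([0.5, 0.4, 0.6, 0.5, 0.5])
print(history)
```

Under these assumptions λ never increases, so a learnable λ would need some extra constraint or reparameterization to stay in a useful range, which a fixed hand-picked value avoids.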