vzhong/e3

Difference between Equation 14 in paper and its implementation

Yifan-Gao opened this issue · 0 comments

In Equation 14 of the E3 paper, the summary vector C is computed only from the extracted spans (for each span, a summation over the tokens from s_i to e_i).

However, the implementation here attends over all unmasked input tokens (every position where input_mask is 1) when computing the summary vector C:

e3/model/entail.py

Lines 59 to 61 in 0c6b771

inp_attn_score = self.inp_attn_scorer(self.dropout(out['bert_enc'])).squeeze(2) - (1-out['input_mask'].float()).mul(1e20)
inp_attn = F.softmax(inp_attn_score, dim=1).unsqueeze(2).expand_as(out['bert_enc']).mul(self.dropout(out['bert_enc'])).sum(1)
out['clf_scores'] = self.class_clf(self.dropout(inp_attn))
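
To make the difference concrete, here is a minimal pure-Python sketch of the two pooling variants. It is not the repository's code and not a transcription of Equation 14: the helper names (`attention_pool`, `span_mask`), the toy encodings, and the uniform scores are all hypothetical, and it assumes spans are inclusive (s_i, e_i) token index pairs. The only thing that changes between the two calls is which positions the softmax is allowed to see.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a non-empty list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(enc, scores, keep):
    """Attention-weighted sum of token vectors, restricted to positions where keep[i] is True.

    Assumes at least one position is kept.
    """
    idx = [i for i, k in enumerate(keep) if k]
    weights = softmax([scores[i] for i in idx])
    pooled = [0.0] * len(enc[0])
    for w, i in zip(weights, idx):
        for d, value in enumerate(enc[i]):
            pooled[d] += w * value
    return pooled


def span_mask(length, spans):
    """Mark every token covered by an inclusive (start, end) span (hypothetical helper)."""
    keep = [False] * length
    for start, end in spans:
        for i in range(start, end + 1):
            keep[i] = True
    return keep


# Toy data: 4 token vectors, the last one is padding.
enc = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [5.0, 5.0]]
scores = [0.0, 0.0, 0.0, 0.0]           # uniform attention scores for readability
input_mask = [True, True, True, False]  # what the quoted lines mask on
spans = [(1, 2)]                        # assumed extracted span covering tokens 1..2

# Variant in the quoted lines: softmax over every unmasked token.
all_tokens = attention_pool(enc, scores, input_mask)

# Variant the issue reads from the paper: softmax over span tokens only.
span_only = attention_pool(enc, scores, span_mask(len(enc), spans))
```

With these toy inputs the two summary vectors differ, because token 0 contributes to `all_tokens` but lies outside the span and is excluded from `span_only`.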