awslabs/dgl-ke

RESCAL Score

nomisto opened this issue · 1 comments

Hello,

We are using dglke for baseline evaluation of our dataset. However, RESCAL outperforms the other approaches by a wide margin, with more than 20% higher hits@1, which seems odd.

I've looked into your code and discovered that the negative score functions for heads and tails do not produce the same score for the same triple. What surprised me further is that the create_neg(False) (negative tails) function produces the same score as the inverse of the triple, so I guess the error lies in a wrong matrix multiplication.

How certain are you that the RESCAL implementation is correct? Could you review it?

Here is a quick snippet to reproduce:

import torch
from dglke.models.pytorch.score_fun import RESCALScore

scorer = RESCALScore(300,300)

torch.manual_seed(0)
heads = torch.randn(1, 300)
tails = torch.randn(1, 300)
relations = torch.randn(1, 90000)

neg_head_score = scorer.create_neg(True)(heads, relations, tails, 1, 1, 1)
neg_tail_score = scorer.create_neg(False)(heads, relations, tails, 1, 1, 1)


class FakeEdge(object):
    def __init__(self, head_emb, rel_emb, tail_emb):
        self._hobj = {}
        self._robj = {}
        self._tobj = {}
        self._hobj['emb'] = head_emb
        self._robj['emb'] = rel_emb
        self._tobj['emb'] = tail_emb

    @property
    def src(self):
        return self._hobj

    @property
    def dst(self):
        return self._tobj

    @property
    def data(self):
        return self._robj

triple_score = scorer.edge_func(FakeEdge(heads, relations, tails))

print(f"Score via create_neg(true): {neg_head_score}")
print(f"Score via create_neg(false): {neg_tail_score}")
print(f"Score via edge_func: {triple_score}")


inv_triple_score = scorer.edge_func(FakeEdge(tails, relations, heads))
print(f"Score of inverse triple via edge_func: {inv_triple_score}")

Output:

Score via create_neg(true): tensor([[[-49.9360]]])
Score via create_neg(false): tensor([[[-347.1286]]])
Score via edge_func: {'score': tensor([-49.9360])}
Score of inverse triple via edge_func: {'score': tensor([-347.1284])}
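
One way to read these numbers, sketched in plain PyTorch rather than with dglke itself: assuming RESCAL scores a triple as h^T R t (which is what edge_func appears to compute), multiplying the relation matrix by the head first and then taking the dot product with the tail gives t^T R h, i.e. the score of the inverse triple. The dimension d and the tensor names below are illustrative and not taken from the library.

```python
import torch

torch.manual_seed(0)
d = 4  # small illustrative dimension, not the 300 used above
h = torch.randn(d, dtype=torch.float64)
t = torch.randn(d, dtype=torch.float64)
R = torch.randn(d, d, dtype=torch.float64)

score = h @ (R @ t)          # h^T R t: score of the triple (h, r, t)
inverse_score = t @ (R @ h)  # t^T R h: score of the inverse triple (t, r, h)

# Multiplying R by the head first and then dotting with the tail
# is the pattern that would explain the mismatch reported above.
head_first = (R @ h) @ t                 # equals t^T R h
head_first_transposed = (R.T @ h) @ t    # equals h^T R t

print(torch.isclose(head_first, inverse_score))
print(torch.isclose(head_first_transposed, score))
```

If this reading is right, the negative-tail path needs R^T wherever it multiplies the relation matrix by the head embedding.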

TransR seems to have the same issue. There, it looks as if the negative score functions (and their prepare functions) for heads and tails were interchanged.

For RESCAL: transposing the relation tensor with relations = th.transpose(relations.view(-1, relation_dim, entity_dim), 1, 2) prior to the matrix multiplication in the function below seems to give the right scores; however, I am not sure whether this is correct.

def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
    hidden_dim = heads.shape[1]
    tails = tails.reshape(num_chunks, neg_sample_size, hidden_dim)
    tails = th.transpose(tails, 1, 2)
    heads = heads.unsqueeze(-1)
    relations = relations.view(-1, self.relation_dim, self.entity_dim)
    tmp = th.matmul(relations, heads).squeeze(-1)
    tmp = tmp.reshape(num_chunks, chunk_size, hidden_dim)
    return th.bmm(tmp, tails)
return fn
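
The transposition suggested above can be tried in isolation. The sketch below is a hypothetical standalone copy of the function: the name neg_tail_score and the explicit relation_dim/entity_dim arguments are my assumptions and stand in for self. It is compared against a direct einsum of h^T R t. It only shows that the transposed version matches that formula for every (head, negative tail) pair in a chunk; whether this is the right fix for dglke is still the open question.

```python
import torch as th

def neg_tail_score(heads, relations, tails, num_chunks, chunk_size,
                   neg_sample_size, relation_dim, entity_dim):
    # Hypothetical standalone variant of the function above, with R transposed.
    hidden_dim = heads.shape[1]
    tails = tails.reshape(num_chunks, neg_sample_size, hidden_dim)
    tails = th.transpose(tails, 1, 2)
    heads = heads.unsqueeze(-1)
    relations = relations.view(-1, relation_dim, entity_dim)
    relations = th.transpose(relations, 1, 2)  # the proposed change
    tmp = th.matmul(relations, heads).squeeze(-1)  # R^T h for each positive
    tmp = tmp.reshape(num_chunks, chunk_size, hidden_dim)
    return th.bmm(tmp, tails)  # (R^T h) . t' = h^T R t'

th.manual_seed(0)
d, num_chunks, chunk_size, neg_sample_size = 4, 2, 3, 5
heads = th.randn(num_chunks * chunk_size, d, dtype=th.float64)
relations = th.randn(num_chunks * chunk_size, d * d, dtype=th.float64)
tails = th.randn(num_chunks * neg_sample_size, d, dtype=th.float64)

scores = neg_tail_score(heads, relations, tails,
                        num_chunks, chunk_size, neg_sample_size, d, d)

# Reference: h^T R t' for every positive i and negative tail j in chunk c.
expected = th.einsum(
    'cid,cide,cje->cij',
    heads.view(num_chunks, chunk_size, d),
    relations.view(num_chunks, chunk_size, d, d),
    tails.view(num_chunks, neg_sample_size, d),
)
print(th.allclose(scores, expected))
```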