mniepert/mmkb

Asking for the numerical dataset of FB15k-num, FB15k-237-num (KBLRN paper)

Opened this issue · 5 comments

Could you please share the FB15k-num and FB15k-237-num datasets used in the paper "KBLRN: End-to-End Learning of Knowledge Base Representation with Latent, Relational, and Numerical Features"?

I tried to reproduce the experimental results in Table 4 of this paper, but I cannot create the same validation and test sets as in Table 1. (".. where numerical features are never used for the triples.")

Thank you very much and I am looking forward to hearing from you.

Best regards,

Phuc

Regarding the numerical data, the statistic mentioned in the paper is "This resulted in 116 different numerical features and 12,826 entities".
It seems that [File 1] in your GitHub repo only has 12,493 entities. Is [File 1] the file that was used in this paper?
[File 1]: https://github.com/nle-ml/mmkb/blob/master/FB15K/FB15K_NumericalTriples.txt
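The entity count can be checked with a short script. This is a minimal sketch, assuming each line of the numerical triples file holds three tab-separated fields (entity, numerical feature, value); that layout is an assumption, and `numeric_stats` is a hypothetical helper name, not something from the repo.

```python
def numeric_stats(lines):
    """Count distinct entities and distinct numerical features.

    Assumes each line is 'entity<TAB>feature<TAB>value' (an assumption
    about the file layout); lines with a different shape are skipped.
    """
    entities, features = set(), set()
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        entity, feature, _value = parts
        entities.add(entity)
        features.add(feature)
    return len(entities), len(features)


def numeric_stats_from_file(path):
    # Convenience wrapper: read a local copy of the file and count.
    with open(path, encoding="utf-8") as handle:
        return numeric_stats(handle)
```

Calling `numeric_stats_from_file("FB15K_NumericalTriples.txt")` on a local copy returns a pair `(number_of_entities, number_of_features)` that can be compared with the figures quoted from the paper.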

Hi Phuc,

i) You are right. Actually, there are numerical features for 12,493 entities, not 12,826. I had a dictionary of that size, but only 12,493 entities had values for this type of feature.

ii) The only difference between FB15k-num and FB15k-237-num and the standard versions is the validation and test sets. There, we only perform link prediction on those triples where both head and tail have numerical attributes. We wanted to isolate the effect of the numerical expert, so we performed link prediction on that subset. The training set, however, is the standard one.

Alberto

Hi Alberto,
Thank you for your response.

Regarding ii), could you share the validation and test subsets of FB15k-num and FB15k-237-num? I mean the following sets:
FB15k-num: Valid: 5156 triples, Test: 6012 triples
FB15k-237-num: Valid: 1058 triples, Test: 1215 triples

I removed those triples from the validation and test sets of FB15k-237 and FB15k as you described, but it seems that I got a different list of triples.
For FB15k-237-num, this is what I got:
Valid: 9384 triples, Test: 10934 triples

Thank you.
Phuc

Hi,

Note that to keep a triple, both head and tail need to have at least one numerical feature in common.
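The criterion above can be sketched as follows. This is only an illustration, not the original KBLRN preprocessing code: it assumes the numerical triples are tab-separated as entity, feature, value, and the names `load_entity_features` and `filter_triples` are hypothetical. Setting `require_common=False` gives the looser rule (head and tail each have some numerical feature), which may explain larger counts.

```python
from collections import defaultdict


def load_entity_features(lines):
    """Map each entity to the set of numerical features it has a value for.

    Assumes lines of the form 'entity<TAB>feature<TAB>value' (assumption).
    """
    features = defaultdict(set)
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        entity, feature, _value = parts
        features[entity].add(feature)
    return features


def filter_triples(triples, features, require_common=True):
    """Keep (head, relation, tail) triples that satisfy the numerical criterion.

    require_common=True : head and tail share at least one numerical feature.
    require_common=False: head and tail each have at least one feature.
    """
    kept = []
    for head, relation, tail in triples:
        head_feats = features.get(head, set())
        tail_feats = features.get(tail, set())
        if require_common:
            keep = bool(head_feats & tail_feats)
        else:
            keep = bool(head_feats) and bool(tail_feats)
        if keep:
            kept.append((head, relation, tail))
    return kept
```

Applying `filter_triples` to the standard validation and test triples, with features loaded from the numerical triples file, yields the candidate subsets whose sizes can then be compared with Table 1.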

Please find attached the test sets of FB15k-num and FB15k-237-num. Sorry, but I can't find the validation files right now... I have left NEC, and it is now hard for me to locate these files.

Alberto

NF_15ktest_triples.txt
NF_15k237test_triples.txt

Hi Alberto,

That is great. (The test files are enough for me to run the verification experiments.)

Thank you very much for your time and effort.

Phuc