Can not reproduce the result on kinship dataset

Question

Lee-zix opened this issue 6 years ago · 4 comments

Dear authors, I ran the experiment on the kinship dataset with the config file "kinship-rs.sh", but I cannot reproduce the result in the paper. This is my config file:

------------------------------Config-------------------------------------
use_action_space_bucketing="True"
bandwidth=400
entity_dim=200
relation_dim=200
history_dim=200
history_num_layers=3
num_rollouts=20
num_rollout_steps=2
bucket_interval=10
num_epochs=1000
num_wait_epochs=400
num_peek_epochs=2
batch_size=128
train_batch_size=128
dev_batch_size=32
learning_rate=0.001
baseline="n/a"
grad_norm=5
emb_dropout_rate=0.3
ff_dropout_rate=0.1
action_dropout_rate=0.9
action_dropout_anneal_interval=1000
reward_shaping_threshold=0
beta=0.05
relation_only="False"
beam_size=128
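
When a run does not match the paper, one quick check is whether the config in use still matches the one shipped with the repository. The following is a minimal sketch of that comparison, assuming both configs are plain `key=value` shell lines like the ones above. `parse_config` and `diff_configs` are hypothetical helper names, not part of the repository, and the values in the usage lines are made up for illustration.

```python
def parse_config(text):
    """Parse shell-style key=value lines into a dict, dropping optional double quotes."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and separator lines that carry no assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        config[key.strip()] = value.strip().strip('"')
    return config


def diff_configs(mine, reference):
    """Return {key: (my_value, reference_value)} for every key whose values differ."""
    keys = sorted(set(mine) | set(reference))
    return {k: (mine.get(k), reference.get(k))
            for k in keys if mine.get(k) != reference.get(k)}


# Illustrative values only; they are not taken from the repository.
mine = parse_config('beam_size=128\nbeta=0.05\nrelation_only="False"')
reference = parse_config('beam_size=128\nbeta=0.02\nrelation_only="False"')
print(diff_configs(mine, reference))
```

An empty result means the two configs agree on every key, which points the search toward code changes rather than hyperparameters.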

------------------------------Result with ConvE-------------------------------------
Dev set performance:
Hits@1 = 0.5655430711610487
Hits@3 = 0.8398876404494382
Hits@5 = 0.9119850187265918
Hits@10 = 0.952247191011236
MRR = 0.7152329056775273
Hits@1 = 0.7397003745318352
Hits@3 = 0.8838951310861424
Hits@5 = 0.9250936329588015
Hits@10 = 0.9550561797752809
MRR = 0.8201877377481808
Test set performance:
Hits@1 = 0.7262569832402235
Hits@3 = 0.8975791433891993
Hits@5 = 0.9348230912476723
Hits@10 = 0.9720670391061452
MRR = 0.8186909747405656
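
For readers comparing these numbers, Hits@k and MRR are both computed from the rank of the correct answer for each query. The following is a minimal sketch using the standard definitions, assuming 1-based ranks. It is not the repository's evaluation code, and `hits_at_k` and `mean_reciprocal_rank` are hypothetical names.

```python
def hits_at_k(ranks, k):
    """Fraction of queries whose correct answer is ranked within the top k (1-based ranks)."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries (1-based ranks)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Toy ranks for four queries, chosen only for illustration.
ranks = [1, 2, 4, 8]
print(hits_at_k(ranks, 1))          # 0.25
print(hits_at_k(ranks, 3))          # 0.5
print(mean_reciprocal_rank(ranks))  # 0.46875
```

Because MRR weights the top ranks heavily, a drop in Hits@1 moves MRR much more than the same drop in Hits@10.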

The gap between my result and the result in the paper is very large. Can you give me some advice on how to reproduce the result on kinship? Thanks very much!

Answer 1 · 2019-04-07T18:23:01.000Z

Happy to help. First of all, can you also post the performance of the ConvE model that is used for reward shaping?

Answer 2 · 2019-04-08T01:47:04.000Z

Thanks very much for the timely reply. The results of the ConvE model are below:
------------------------------Result of ConvE-------------------------------------
Hits@1 = 0.5908239700374532
Hits@3 = 0.8895131086142322
Hits@5 = 0.9485018726591761
Hits@10 = 0.9765917602996255
MRR = 0.7458009978599914
Hits@1 = 0.795880149812734
Hits@3 = 0.9269662921348315
Hits@5 = 0.9569288389513109
Hits@10 = 0.9765917602996255
MRR = 0.866499525780971
Test set performance:
Hits@1 = 0.7905027932960894
Hits@3 = 0.9459962756052142
Hits@5 = 0.9720670391061452
Hits@10 = 0.9851024208566108
MRR = 0.8696373694340752

I trained the ConvE model with your default config file, and the performance seems similar to that reported in the paper.

Answer 3 · 2019-04-08T07:21:23.000Z

Yes, the ConvE performance looks right.

How many epochs did you train the model for? Did you modify any early stopping condition provided in the code?
I trained the kinship reward shaping model from scratch using a clone of this repo. The model performance at epoch 236 is as shown below.

Hits@1 = 0.591
Hits@3 = 0.879
Hits@5 = 0.926
Hits@10 = 0.961
MRR = 0.740
Hits@1 = 0.772
Hits@3 = 0.910
Hits@5 = 0.934
Hits@10 = 0.965
MRR = 0.848
Test set performance:
Hits@1 = 0.774
Hits@3 = 0.919
Hits@5 = 0.950
Hits@10 = 0.975
MRR = 0.852

Training for more epochs still mildly improves the performance (the maximum number of training epochs in the configuration file is set to 1000).

I did notice that the model learns really slowly once the dev set MRR (correct evaluation) reached 0.69. For example, the dev set MRR (correct evaluation) was at 0.692 at epoch 50, and it slowly crawled to 0.74 in the next ~200 epochs. I also noticed oscillations during training, i.e. the dev set MRR does not smoothly move to 0.74 but oscillates between 0.71-0.73 for many epochs.
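
Slow, oscillating progress like this is why the early stopping condition matters: a short patience window can end training during a plateau. The following is a minimal sketch of a generic patience rule, assuming one dev MRR value is recorded per evaluation. It is not the repository's actual stopping logic. `should_stop` is a hypothetical name, and `num_wait_epochs` is borrowed from the config above only as a parameter name.

```python
def should_stop(dev_mrr_history, num_wait_epochs):
    """Return True if the best dev MRR was reached at least num_wait_epochs evaluations ago."""
    if not dev_mrr_history:
        return False
    # Index of the first occurrence of the best score so far.
    best_index = max(range(len(dev_mrr_history)), key=dev_mrr_history.__getitem__)
    evaluations_since_best = len(dev_mrr_history) - 1 - best_index
    return evaluations_since_best >= num_wait_epochs


# Toy history that oscillates after an early peak (illustrative values only).
history = [0.69, 0.71, 0.73, 0.72, 0.71]
print(should_stop(history, num_wait_epochs=2))  # True
print(should_stop(history, num_wait_epochs=3))  # False
```

With a curve that oscillates for many epochs before improving again, a small patience value under this rule would stop training well before the final score is reached.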

Answer 4 · 2019-04-09T03:12:12.000Z

I downloaded a fresh copy from git and reproduced the result in the paper. Maybe there are some changes in my code; I will check. Thanks very much for your kind help!
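
To find local changes like these, the modified working copy can be compared against a fresh clone. The following is a minimal sketch using Python's standard `filecmp` module. `changed_files` and the directory paths in the usage comment are hypothetical.

```python
import filecmp


def changed_files(local_dir, fresh_dir):
    """Recursively list (status, relative_path) pairs that differ between two trees.

    Note: filecmp.dircmp compares files shallowly (type, size, modification time)
    before falling back to contents, and ignores directories such as .git by default.
    """
    result = []

    def walk(cmp, prefix):
        for name in cmp.diff_files:
            result.append(("modified", prefix + name))
        for name in cmp.left_only:
            result.append(("only_in_local", prefix + name))
        for name in cmp.right_only:
            result.append(("only_in_fresh", prefix + name))
        for sub_name, sub_cmp in cmp.subdirs.items():
            walk(sub_cmp, prefix + sub_name + "/")

    walk(filecmp.dircmp(local_dir, fresh_dir), "")
    return sorted(result)


# Usage (paths are placeholders):
# for status, path in changed_files("path/to/my_copy", "path/to/fresh_clone"):
#     print(status, path)
```

Inside a git checkout, `git status` and `git diff` give the same information against the last commit.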