THUDM/GraphMAE

Inconsistent performance on the COLLAB dataset

ZhuYun97 opened this issue · 7 comments

Thanks for your fabulous code, but I ran into some problems while reproducing the results. After 5 runs (`sh scripts/run_graph.sh COLLAB 0` with seeds 0, 1, 2, 3, 4), the performance on the COLLAB dataset is 78.78±0.46, which leaves a large margin to the reported value (80.32±0.46).
I checked the model configuration in the Appendix and found that the hidden size of GIN is 512 (but 256 in configs.yml). After changing the hidden size to 512, the performance (78.70±0.32) is still inferior to the paper.
Is there anything I am missing? Looking forward to your reply.

@ZhuYun97 Thanks for your interest. The scaling coefficient alpha_l should be set to 1 for COLLAB, while it is currently 2 by default. Sorry for missing this hyper-parameter in configs.yml; we have updated configs.yml. Hope this helps.
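For context, alpha_l is the exponent γ in the scaled cosine error (SCE) reconstruction loss, so it controls how strongly well-reconstructed features are down-weighted. A minimal sketch of that loss (the function name and signature here are illustrative, not necessarily the exact implementation in this repo):

```python
import torch
import torch.nn.functional as F

def sce_loss(x_rec, x_init, alpha=1):
    """Scaled cosine error: mean over masked nodes of (1 - cos(x_rec, x_init)) ** alpha.

    `alpha` corresponds to the `alpha_l` entry in configs.yml (gamma in the paper).
    """
    cos = F.cosine_similarity(x_rec, x_init, dim=-1)  # per-node cosine similarity
    return (1.0 - cos).pow(alpha).mean()
```

With alpha_l = 1 the loss reduces to a plain (1 - cosine) error, while larger values focus the loss on harder-to-reconstruct nodes.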

Thanks for your prompt reply.

Sorry to bother you again. The performance I get on the MUTAG dataset is 86.47±1.81, even though all the hyper-parameters match the Appendix. Is there anything else I should change?

The appendix lists only several key hyper-parameters due to space limits; you can refer to configs.yml for the complete settings. Our experiments were conducted on an NVIDIA 2080 Ti. Did you run the experiments with the hyper-parameters provided in configs.yml? It would help if you could provide more details.

I use the latest configs.yml, and I run on an NVIDIA 3090 (I also tested on an NVIDIA 1080 Ti and got similar results).

MUTAG:
  num_hidden: 32
  num_layers: 5
  lr: 0.0005
  weight_decay: 0.00
  mask_rate: 0.75
  drop_edge_rate: 0.0
  max_epoch: 20
  encoder: gin
  decoder: gin
  activation: prelu
  loss_fn: sce
  scheduler: False  
  pooling: sum
  batch_size: 64
  alpha_l: 2
  replace_rate: 0.1
  norm: batchnorm
  in_drop: 0.2
  attn_drop: 0.1
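For what it's worth, here is a minimal sketch of how such a per-dataset block can be read from configs.yml and applied on top of argparse defaults (the `load_dataset_config` helper and the argument names are illustrative, not the repo's actual loader):

```python
import yaml  # PyYAML

def load_dataset_config(args, path="configs.yml"):
    """Overwrite argparse defaults with the block for args.dataset, if present."""
    with open(path) as f:
        all_cfgs = yaml.safe_load(f)
    cfg = all_cfgs.get(args.dataset, {})
    for key, value in cfg.items():
        setattr(args, key, value)  # e.g. args.alpha_l = 2, args.num_hidden = 32
    return args
```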

After I changed the device to an NVIDIA 2080 Ti, the performance reaches 88.29±1.22. I guess the MUTAG dataset is too small, so the experiments are not very stable.

Indeed, MUTAG is very small, so the variance is larger and the results are not as stable as on other datasets. Previously, we also found that some randomness can remain on the NVIDIA 3090 even when we use the same random seed, so you can just run the experiments on the 2080 Ti.
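For anyone hitting the same issue, these are the typical knobs for pinning down randomness in PyTorch; even with all of them set, some CUDA kernels can stay non-deterministic across GPU generations, which is consistent with the 2080 Ti vs. 3090 gap above (a sketch, not the repo's own seeding code):

```python
import random
import numpy as np
import torch

def set_deterministic(seed: int):
    # Seed every RNG source the training loop touches.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Prefer deterministic cuDNN kernels and disable autotuning;
    # some ops may still differ across GPU architectures or run slower.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```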