diningphil/gnn-comparison

Results don't match - am I running something in the wrong way?

levyofek opened this issue · 2 comments

Hey!
I'm trying to reproduce the ENZYMES results with GIN.
Running on CPU.
Following the instructions, I activated the corresponding virtual environment, prepared the dataset, copied the data_splits into the dataset's folder, and ran:
python Launch_Experiments.py --config-file config_GIN.yml --dataset-name ENZYMES --result-folder results --debug

I get results for the 10 folds and the overall assessment (assessment_results.json):
avg_TR_score 91.61179697196341
std_TR_score 7.0978242989793205
avg_TS_score 69.22222256130644
std_TS_score 4.850862399140947

which is considerably higher than the test accuracy reported in the paper (~59.6).

Can you help me figure out whether I'm doing something wrong, please?

Thanks!

That's very strange! The first thing to check is whether the dataset has been modified since 2019, when we ran the experiments. This is the exact same code we used to produce the numbers, so you should not see such a difference. Could you please check?
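One quick way to check is to download the current TU Dortmund release of ENZYMES and compare its basic statistics against the values used in the paper (600 graphs, 6 classes). This is a minimal sketch assuming PyG's `TUDataset` loader and a scratch directory of your choice; it is independent of how this repo loads data internally:

```python
# Sketch: inspect the current ENZYMES release to see whether the dataset
# itself has changed since 2019. Assumes torch_geometric is installed.
from torch_geometric.datasets import TUDataset

# use_node_attr=True keeps the continuous node attributes in addition to
# the discrete node labels.
dataset = TUDataset(root="/tmp/ENZYMES", name="ENZYMES", use_node_attr=True)

print("graphs:       ", len(dataset))          # expected: 600
print("classes:      ", dataset.num_classes)   # expected: 6
print("node features:", dataset.num_node_features)

total_nodes = sum(data.num_nodes for data in dataset)
total_edges = sum(data.num_edges for data in dataset)
print(f"avg nodes per graph: {total_nodes / len(dataset):.2f}")
print(f"avg edges per graph: {total_edges / len(dataset):.2f}")
```

If any of these numbers differ from the dataset you prepared for the experiments, that alone could explain the shifted accuracy.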

Also, are you using a newer version of the PyG libraries? There may have been changes compared to the original implementation of GIN.
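To make the setups easy to compare, you could paste the output of a quick version check (a minimal sketch; the exact pinned versions live in the repo's environment files):

```python
# Print the library versions actually used at runtime, to compare against
# the versions pinned in the repo's environment.
import torch
import torch_geometric

print("torch:           ", torch.__version__)
print("torch_geometric: ", torch_geometric.__version__)
print("CUDA available:  ", torch.cuda.is_available())
```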

Have you installed the original environment of the paper or the newer one (see README)?

Closing due to inactivity.