cora dataset node classification performance under mettack
cxw-droid opened this issue · 6 comments
Hi,
Thanks for sharing your paper's code.
- I tried to reproduce the cora dataset test results for classification accuracy under mettack, as shown in Table 2 of your paper, but the performance is much lower (e.g. accuracy 0.8033 for ptb_rate 0.05, 0.7425 for ptb_rate 0.1, 0.6831 for ptb_rate 0.15).
I used the default settings and ran the command as python train.py --dataset cora --attack meta --ptb_rate 0.1 --epoch 1000 (with ptb_rate changed as required).
So can you please tell me what settings you used, or how I can reproduce a similar result?
- The "Run the code" command in your README file is broken:
python train.py --dataset polblogs --attack meta --ptb_rate 0.15 --epoch 1000
outputs the error AssertionError: ProGNN splits only cora, citeseer, pubmed, cora_ml
Hi, thanks for your interest.
- To reproduce the performance, please run the scripts in the scripts folder as mentioned in the README. For example:
sh scripts/meta/cora_meta.sh
To test performance under different severities of attack, you can change the ptb_rate in those bash files.
- Thanks for pointing it out. I have just fixed the bug. You may now reclone DeepRobust and install it.
git clone https://github.com/DSE-MSU/DeepRobust.git
cd DeepRobust
python setup_empty.py install
Let me know if you have more questions.
Thanks for your quick response.
I tested the code using sh scripts/meta/cora_meta.sh on the cora dataset at different values of ptb_rate. The test results are as follows:
ptb_rate -> accuracy
0.05 → 0.8295
0.10 → 0.7822
0.15 → 0.7631
0.20 → 0.5739
0.25 → 0.5282
It seems the accuracy at ptb_rate 0.20 and 0.25 drops significantly and is much lower than expected. I changed only the ptb_rate in the shell script to run the test.
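For reference, a sweep like the one described above can also be scripted instead of editing the shell script by hand. This is only a hypothetical sketch: it calls train.py directly with the flags quoted earlier in the thread, and assumes that command works from the repository root.

```python
# Hypothetical sweep over perturbation rates; the flags mirror the
# train.py command quoted earlier in this thread.
import os

RATES = [0.05, 0.10, 0.15, 0.20, 0.25]

def make_cmd(rate):
    # Build the training command for one perturbation rate.
    return ("python train.py --dataset cora --attack meta "
            "--ptb_rate %s --epoch 1000" % rate)

# Guard so the sketch is safe to run outside the repository.
if os.path.exists("train.py"):
    for rate in RATES:
        os.system(make_cmd(rate))
```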
Hi, thanks for the feedback!
I just found that the hyper-parameters provided in the cora-meta scripts are not exactly the ones used in the experiments. Please change lr=1e-3 to lr=5e-4 and epoch=400 to epoch=1000. I have also updated them in the script.
Btw, below are the results I got for cora-meta (for one seed):
0.05 -> 0.8300
0.10 -> 0.7943
0.15 -> 0.7606
0.20 -> 0.7369
0.25 -> 0.6942
Thanks, I got a similar result.
But when I tried a couple of different seeds, the results were 0.02~0.03 lower for ptb_rate 0.2 and 0.25. Do you have the code/script to run multiple rounds of tests to get the mean and std of the accuracy?
Hi,
- It could happen that some results are lower because the variance is relatively higher than at other perturbation rates. I would suggest running more seeds.
- I cannot find the script now, but it basically runs the given command 10 times (note that all the experiments are evaluated under seeds from 10 to 19). The Python script does something like this:
# filename: run.py
import os

seeds = list(range(10, 20))
for seed in seeds:
    command = "python train.py --dataset cora --seed %s" % seed
    os.system(command)
Then you run
python run.py >> cora.out
The cora.out file stores all the output from the program, and we can simply write a script that extracts the lines containing the string "Test set results:" and uses those results to calculate the mean/std.