Request for GraphMVP hyper-parameters
Data-reindeer opened this issue · 7 comments
Recently, I tried to reproduce the classification results with the code provided in the link.
In pre-training, I adopted the parameters given in submit_pre_training_GraphMVP.sh, and the default values for those not mentioned in the bash file. In fine-tuning, I just ran 3 seeds [0, 1, 2]. (I notice that Table 1 in your paper also reports results over 3 seeds, but perhaps with different ones.)
However, the results I got are quite different, especially on the clintox, hiv, and tox21 datasets. Below are my results:
seed | bbbp | tox21 | bace | clintox | sider | hiv | muv
---|---|---|---|---|---|---|---
0 | 0.7129 | 0.7382 | 0.7771 | 0.6117 | 0.6043 | 0.7615 | 0.7539
1 | 0.6800 | 0.7424 | 0.8006 | 0.5657 | 0.5916 | 0.7389 | 0.7544
2 | 0.7033 | 0.7339 | 0.7741 | 0.5843 | 0.6108 | 0.7190 | 0.7289
avg | 0.6987 | 0.7382 | 0.7839 | 0.5872 | 0.6022 | 0.7398 | 0.7457
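The avg row is the plain mean over the three seeds. For reference, a minimal sketch (numpy, values copied from the table above) that also adds the standard deviation, which is handy when comparing against Table 1:

```python
import numpy as np

# Test ROC-AUC per fine-tuning seed (0, 1, 2), copied from the table above.
rocs = np.array([
    [0.7129, 0.7382, 0.7771, 0.6117, 0.6043, 0.7615, 0.7539],  # seed 0
    [0.6800, 0.7424, 0.8006, 0.5657, 0.5916, 0.7389, 0.7544],  # seed 1
    [0.7033, 0.7339, 0.7741, 0.5843, 0.6108, 0.7190, 0.7289],  # seed 2
])
datasets = ['bbbp', 'tox21', 'bace', 'clintox', 'sider', 'hiv', 'muv']
for name, mean, std in zip(datasets, rocs.mean(axis=0), rocs.std(axis=0)):
    print(f'{name}: {mean:.4f} +/- {std:.4f}')
```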
I think this might be caused by the different seeds that I used. Could you provide the full parameter settings needed to reproduce the results reported in Table 1 of your paper? That would be helpful.
Hi,
Thank you for your interest.
The downstream fine-tunings use seeds 0-2.
Can you specify which GraphMVP hyper-parameters you used for this table? And also, how did you get the test ROC values?
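For context, the usual MoleculeNet-style protocol computes ROC-AUC per task on the labeled entries only and then averages over tasks. A minimal sketch (not necessarily this repo's exact code), assuming labels in {-1, 0, +1} with 0 marking a missing label:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def multitask_roc(y_true, y_scores):
    """Mean per-task ROC-AUC, skipping tasks without both classes.

    y_true: (num_molecules, num_tasks) with labels in {-1, 0, +1},
    where 0 marks a missing label; y_scores: same shape, raw scores.
    """
    rocs = []
    for task in range(y_true.shape[1]):
        labels, scores = y_true[:, task], y_scores[:, task]
        valid = labels ** 2 > 0                # drop molecules with missing labels
        if len(np.unique(labels[valid])) < 2:  # AUC needs both classes present
            continue
        rocs.append(roc_auc_score((labels[valid] + 1) / 2, scores[valid]))
    return float(np.mean(rocs))
```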
Thank you for your timely reply!
Below are the parameters that I used:
import argparse

parser = argparse.ArgumentParser()

# about seed and basic info
parser.add_argument('--seed', type=int, default=42)
parser.add_argument('--runseed', type=int, default=0)
parser.add_argument('--device', type=int, default=0)
# about dataset and dataloader
parser.add_argument('--input_data_dir', type=str, default='../datasets/GEOM')
parser.add_argument('--dataset', type=str, default='bace')
parser.add_argument('--num_workers', type=int, default=8)
# about training strategies
parser.add_argument('--split', type=str, default='scaffold')
parser.add_argument('--batch_size', type=int, default=256)
parser.add_argument('--epochs', type=int, default=100)
parser.add_argument('--lr', type=float, default=0.001)
parser.add_argument('--lr_scale', type=float, default=1)
parser.add_argument('--decay', type=float, default=0)
# about molecule GNN
parser.add_argument('--gnn_type', type=str, default='gin')
parser.add_argument('--num_layer', type=int, default=5)
parser.add_argument('--emb_dim', type=int, default=300)
parser.add_argument('--dropout_ratio', type=float, default=0)
parser.add_argument('--graph_pooling', type=str, default='mean')
parser.add_argument('--JK', type=str, default='last')
parser.add_argument('--gnn_lr_scale', type=float, default=1)
parser.add_argument('--model_3d', type=str, default='schnet', choices=['schnet'])
# for AttributeMask
parser.add_argument('--mask_rate', type=float, default=0.15)
parser.add_argument('--mask_edge', type=int, default=0)
# for ContextPred
parser.add_argument('--csize', type=int, default=3)
parser.add_argument('--contextpred_neg_samples', type=int, default=1)
# for SchNet
parser.add_argument('--num_filters', type=int, default=128)
parser.add_argument('--num_interactions', type=int, default=6)
parser.add_argument('--num_gaussians', type=int, default=51)
parser.add_argument('--cutoff', type=float, default=10)
parser.add_argument('--readout', type=str, default='mean', choices=['mean', 'add'])
parser.add_argument('--schnet_lr_scale', type=float, default=0.1)
# for 2D-3D Contrastive CL
parser.add_argument('--CL_neg_samples', type=int, default=1)
parser.add_argument('--CL_similarity_metric', type=str, default='EBM_dot_prod',
choices=['InfoNCE_dot_prod', 'EBM_dot_prod'])
parser.add_argument('--T', type=float, default=0.1)
parser.add_argument('--normalize', dest='normalize', action='store_true')
parser.add_argument('--no_normalize', dest='normalize', action='store_false')
parser.add_argument('--SSL_masking_ratio', type=float, default=0)
parser.add_argument('--AE_model', type=str, default='VAE', choices=['AE', 'VAE'])
parser.set_defaults(AE_model='AE')
# for 2D-3D AutoEncoder
parser.add_argument('--AE_loss', type=str, default='l2', choices=['l1', 'l2', 'cosine'])
parser.add_argument('--detach_target', dest='detach_target', action='store_true')
parser.add_argument('--no_detach_target', dest='detach_target', action='store_false')
parser.set_defaults(detach_target=True)
# for 2D-3D Variational AutoEncoder
parser.add_argument('--beta', type=float, default=1)
# for 2D-3D Contrastive CL and AE/VAE
parser.add_argument('--alpha_1', type=float, default=1)
parser.add_argument('--alpha_2', type=float, default=1)
# for 2D SSL and 3D-2D SSL
parser.add_argument('--SSL_2D_mode', type=str, default='AM')
parser.add_argument('--alpha_3', type=float, default=0.1)
parser.add_argument('--gamma_joao', type=float, default=0.1)
parser.add_argument('--gamma_joaov2', type=float, default=0.1)
# about if we would print out eval metric for training data
parser.add_argument('--eval_train', dest='eval_train', action='store_true')
parser.add_argument('--no_eval_train', dest='eval_train', action='store_false')
parser.set_defaults(eval_train=True)
# about loading and saving
parser.add_argument('--input_model_file', type=str, default='../output/pretraining_model.pth')
parser.add_argument('--output_model_dir', type=str, default='../output')
# verbosity
parser.add_argument('--verbose', dest='verbose', action='store_true')
parser.add_argument('--no_verbose', dest='verbose', action='store_false')
parser.set_defaults(verbose=False)
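A quick way to double-check the effective defaults after the paired `store_true`/`store_false` flags and the `set_defaults()` calls above is to parse an empty argument list:

```python
args = parser.parse_args([])   # empty list -> pure defaults, no CLI input
print(args.AE_model)           # 'AE'  (set_defaults overrides default='VAE')
print(args.detach_target)      # True
print(args.eval_train)         # True
print(args.verbose)            # False
```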
So it seems that you are using the default hyper-parameters? If so, they may not be the optimal ones.
Hi @chao1224
Yes, I followed the settings in submit_pre_training_GraphMVP.sh.
So could you provide the parameter settings that get the optimal results (reported in Table 1)?
It would be interesting to see how the parameters affect model results.
Hey @Data-reindeer
That might be the issue. We are still cleaning up the repo and scripts, and will let you know once it's ready.
Note: one thing we didn't highlight in the paper is that the hyper-parameters are robust when using masking ratio = 0.3. If you are in a rush to get results, you may as well try that first.
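For context, `SSL_masking_ratio` controls the fraction of atoms masked during SSL pre-training. An illustrative sketch (not the repo's exact implementation) of sampling masked atoms at ratio 0.3:

```python
import torch

def sample_masked_atom_indices(num_atoms: int, masking_ratio: float = 0.3):
    # Illustrative only: randomly pick a masking_ratio fraction of the
    # atoms in a molecule to mask during SSL pre-training.
    num_mask = max(1, int(masking_ratio * num_atoms))
    return torch.randperm(num_atoms)[:num_mask]
```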
Hi @chao1224
That will be very helpful. Thanks a lot for your reply !
Regards,
reindeer
Hi @Data-reindeer,
We just quickly uploaded a noisy version to the Google Drive. Now it should include:
- The `pretraining.out` file; the first line is the hyper-parameters.
- The `pretraining_model.pth` checkpoint.
- Downstream log files.
- Both masking ratio = 0.15 and 0.3, corresponding to the last two rows in Table 2.

A cleaner version will be uploaded together with the second clean-up commit.
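If it helps, the recorded hyper-parameters can be read straight back from that log. A trivial sketch, assuming the file has been downloaded locally as `pretraining.out`:

```python
# Print the hyper-parameters recorded on the first line of the log.
with open('pretraining.out') as f:
    print(f.readline().strip())
```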