malllabiisc/WordGCN

Detailed experimental parameter settings

Zeyu-Liang opened this issue · 1 comment

Hi,
Thank you for your paper, as well as releasing the code.
I ran your source code with the default settings but obtained poor results. The experimental setup is shown below:

2019-09-22 11:04:39,606 - test_embeddings_22_09_2019_11:04:39 - [INFO] - {'embed_loc': None, 'gcn_layer': 1, 'batch_size': 512, 'sample': 0.0001, 'lr': 0.001, 'config_dir': './config/', 'dropout': 1.0, 'max_epochs': 5, 'total_sents': 56974869, 'num_neg': 25, 'log_dir': './log/', 'side_int': 10000, 'log_db': 'aaai_runs', 'emb_dir': './embeddings/', 'opt': 'adam', 'onlyDump': False, 'restore': False, 'l2': 0.0, 'context': False, 'gpu': '0', 'seed': 1234, 'name': 'test_embeddings_22_09_2019_11:04:39', 'embed_dim': 300}

|          | WS353S | WS353R | SimLex999 | RW   | AP   | Battig | BLESS | SemEval2012 | MSR  |
|----------|--------|--------|-----------|------|------|--------|-------|-------------|------|
| SynGCN   | 73.2   | 45.7   | 45.5      | 33.7 | 69.3 | 45.2   | 85.2  | 23.4        | 52.8 |
| our imp. | 75.4   | 39.9   | 44.7      | 30.1 | 66.8 | 44.9   | 77.0  | 21.5        | 41.3 |
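For context, word-similarity benchmarks such as WS353 are typically scored as the Spearman rank correlation between human similarity ratings and the cosine similarity of the corresponding embedding pairs. A minimal sketch of that evaluation, using hypothetical toy vectors and ratings rather than the actual benchmark data:

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def spearman(xs, ys):
    """Spearman rank correlation (toy version; assumes no tied values)."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical embeddings and human ratings for three word pairs.
emb = {
    "cat": [1.0, 0.2, 0.0],
    "dog": [0.9, 0.3, 0.1],
    "car": [0.0, 1.0, 0.8],
}
pairs = [("cat", "dog", 9.0), ("cat", "car", 2.0), ("dog", "car", 3.0)]
model_scores = [cosine(emb[a], emb[b]) for a, b, _ in pairs]
human_scores = [h for _, _, h in pairs]
print(round(spearman(model_scores, human_scores), 3))  # → 1.0
```

Note that rank correlation only cares about the ordering of the pairs, so two embedding sets can score very differently even when their raw cosine values look similar.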

Where did I go wrong?

Hi @Zeyu-Liang,
The default hyperparams in syngcn.py were not the best ones (I have now updated them). Moreover, your trained embeddings were 100 dimensional, whereas we report performance with 300 dimensional embeddings. Using the updated hyperparams should reproduce the reported performance on the benchmarks.

After the main training completes, we found that fine-tuning for a few epochs with SGD at a lower learning rate gives a small additional performance gain. For that, run the following:

python syngcn.py -name test_embeddings_xx_xx_xx  -restore -opt sgd -lr 0.001 -l2 0.0 -epoch 5
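The idea behind that command is a two-phase schedule: train with an adaptive optimizer first, then restore the checkpoint and polish with plain SGD at a small step size. This is not the repo's actual TensorFlow code; just the optimization pattern, sketched on a toy least-squares objective:

```python
# Toy objective: minimize (w - 3)^2; stands in for the embedding loss.
def grad(w):
    return 2 * (w - 3.0)

# Phase 1: main training with a larger step
# (stand-in for the Adam run with lr=0.001 in the log above).
w = 0.0
for _ in range(50):
    w -= 0.1 * grad(w)

# Phase 2: "restore" the trained weights and fine-tune with plain SGD
# at a lower learning rate, mirroring `-restore -opt sgd -lr 0.001 -epoch 5`.
for _ in range(5):
    w -= 0.001 * grad(w)

print(abs(w - 3.0) < 1e-3)  # → True: fine-tuning keeps refining toward the optimum
```

The small SGD steps cannot overshoot the minimum the way a momentum-based optimizer sometimes does late in training, which is one common explanation for the extra gain from this kind of fine-tuning.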

Thanks