Baseline results for GRU and LSTM on MemoryGym
subho406 opened this issue · 2 comments
Hi,
Thanks for the amazing implementation. I was wondering if it would be possible to release the baseline implementations, along with the hyperparameters used, for GRU and LSTM on the Memory Gym environment (https://openreview.net/pdf?id=jHc8dCx6DDr)? I am hoping to use MemoryGym for my thesis work, and this would be extremely helpful. Thanks!
Hello!
The results are produced using neroRL (develop branch).
We are currently updating our GRU baseline repository to support Memory Gym. The develop branch should be functional, but we still need to reproduce our results, which is the last step before merging into the main branch. This should be done within the next two weeks. So feel free to use neroRL for training now; afterwards, you can use the other repository to follow our implementation concept more easily.
Also, we found better hyperparameters for MMGrid and MPGrid using Optuna, which we have just implemented in neroRL (develop):
| Hyperparameter | MM Grid | MP Grid |
|---|---|---|
| Worker | 32 | 32 |
| Worker steps | 512 | 512 |
| Epochs | 3 | 3 |
| Num minibatches | 8 | 8 |
| gamma | 0.995 | 0.995 |
| lambda | 0.95 | 0.95 |
| value loss coefficient | 0.5 | 0.5 |
| advantage normalization | batch | none |
| max grad norm | 0.25 | 0.25 |
| clip range | 0.1 | 0.2 |
| init learning rate | 2.50E-04 | 2.75E-04 |
| final learning rate | 1.00E-05 | 1.00E-05 |
| init entropy coefficient | 0.0001 | 0.001 |
| final entropy coefficient | 0.000001 | 0.000001 |
| **Recurrence** | | |
| num layers | 1 | 1 |
| layer type | GRU | GRU |
| sequence length | -1 | -1 |
| hidden state size | 512 | 512 |
| residual | TRUE | FALSE |
| updates | 5000 | 10000 |
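For reference, here is a minimal sketch of these settings written out as a plain Python dictionary, together with a linear annealing helper for the init/final learning rate and entropy coefficient pairs. The key names, the `linear_decay` helper, and the assumption that these values are annealed linearly over the training updates are all illustrative, not neroRL's actual config schema; please check the config files in the repository for the exact format.

```python
# Illustrative sketch only: key names and the linear schedule are assumptions,
# not neroRL's actual config schema. Values are taken from the MM Grid column above.
mm_grid_config = {
    "worker": 32,
    "worker_steps": 512,
    "epochs": 3,
    "num_minibatches": 8,
    "gamma": 0.995,
    "lambda": 0.95,                        # GAE lambda
    "value_loss_coefficient": 0.5,
    "advantage_normalization": "batch",    # "none" for MP Grid
    "max_grad_norm": 0.25,
    "clip_range": 0.1,                     # 0.2 for MP Grid
    "learning_rate": {"initial": 2.5e-4, "final": 1.0e-5},
    "entropy_coefficient": {"initial": 1e-4, "final": 1e-6},
    "recurrence": {
        "num_layers": 1,
        "layer_type": "gru",
        "sequence_length": -1,             # -1: presumably one full episode per sequence
        "hidden_state_size": 512,
        "residual": True,                  # False for MP Grid
    },
    "updates": 5000,                       # 10000 for MP Grid
}

def linear_decay(initial: float, final: float, update: int, max_updates: int) -> float:
    """Anneal a value linearly from `initial` to `final` over `max_updates` updates
    (an assumed schedule; the repository may decay these values differently)."""
    frac = min(update / max_updates, 1.0)
    return initial + frac * (final - initial)

# Example: learning rate halfway through MM Grid training (update 2500 of 5000)
lr = linear_decay(2.5e-4, 1.0e-5, update=2500, max_updates=5000)
```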