MorvanZhou/pytorch-A3C

How come different performance?

tessavdheiden opened this issue · 5 comments

Hi Morvan,

Why do I get totally different performance (see file attached)?

Best,
Tessa
(attachment: a3c_continious)

Hi,

How about increasing 'UPDATE_GLOBAL_ITER' to more than 5?

It was helpful for me, and I got the following performance with UPDATE_GLOBAL_ITER=10.

(screenshot of training performance)
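The suggestion above changes how often each worker pushes gradients to (and pulls weights from) the global net. A minimal runnable sketch of the effect, with the variable name taken from the repo but the helper function invented here for illustration: a larger UPDATE_GLOBAL_ITER means longer n-step rollouts between syncs, which can lower the variance of the returns at the cost of slightly staler gradients.

```python
# Sketch (not the repo's exact code): UPDATE_GLOBAL_ITER controls how many
# environment steps a worker collects before syncing with the global net.
UPDATE_GLOBAL_ITER = 10  # raised from the repo default of 5

def update_points(total_steps, done_step=None):
    """Return the steps at which a worker would push/pull the global net.

    A sync happens every UPDATE_GLOBAL_ITER steps, or when the episode ends.
    `done_step` (hypothetical) marks the step at which the episode terminates.
    """
    points = []
    for t in range(1, total_steps + 1):
        done = (t == done_step)
        if t % UPDATE_GLOBAL_ITER == 0 or done:
            points.append(t)
    return points
```

With UPDATE_GLOBAL_ITER=10 a worker running 25 steps syncs at steps 10 and 20, plus at the terminal step if the episode ends early.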

Hi, when I ran this code, the moving-average reward for the continuous case stayed below -1000, even with 'UPDATE_GLOBAL_ITER' already set to 10. Do you know what the problem could be? Performance in the discrete case is very bad as well.

Hi,

Here is another trial.

Try 'torch.nn.utils.clip_grad_norm_(lnet.parameters(), 20)' in utils.py

(screenshot of training performance)

It helped me reduce the performance differences between runs.
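To show where the suggested call would go: a hedged sketch of the push/pull step from utils.py, with the function body simplified and the argument names (opt, lnet, gnet) assumed from the repo. Clipping the local net's gradients before copying them into the shared global net caps exploding updates, one likely source of run-to-run differences.

```python
import torch
import torch.nn as nn

def push_and_pull_sketch(opt, lnet, gnet, loss):
    """Simplified sketch of the worker's sync step (not the repo's exact code)."""
    opt.zero_grad()
    loss.backward()
    # the suggested fix: rescale local gradients so their total norm is at most 20
    torch.nn.utils.clip_grad_norm_(lnet.parameters(), 20)
    # copy local gradients into the shared global net, then step its optimizer
    for lp, gp in zip(lnet.parameters(), gnet.parameters()):
        gp._grad = lp.grad
    opt.step()
    # pull the updated global weights back into the local net
    lnet.load_state_dict(gnet.state_dict())
```

The clipping line is the only change relative to the original flow; everything after it (gradient copy, optimizer step, weight pull) stays the same.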

Thank you! I'll give it a try.

Hi, I ran into a problem while training another A3C.
After some time, all the networks always output the same action.
I tried torch.nn.utils.clip_grad_norm_(lnet.parameters(), 20), but it didn't help.
It may be that during training the network tries many actions but never reaps a reward.
Do you have any ideas about this problem?