TianhongDai/hindsight-experience-replay

All processes update the network and then sync the grads?

nizhihao opened this issue · 1 comment

Hi, I have a doubt.
In this distributed RL setup, because of OS scheduling, will every process end up in a similar state and do the same thing?
So do all processes update the network and then sync the gradients together, like the synchronous algorithm A2C?
Thanks very much.

Yes. Each process computes gradients on its own local batch and the gradients are synced across processes before the update, so it can be regarded as training with a very large effective batch size.
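
For reference, here is a minimal sketch of the synchronous gradient-averaging pattern described above, using mpi4py and PyTorch. The function name `sync_grads`, the toy network, and the batch size are illustrative assumptions, not code copied from this repository.

```python
# Minimal sketch of synchronous gradient averaging across MPI workers.
# Names (sync_grads, the toy Linear net, batch size 32) are illustrative assumptions.
import numpy as np
import torch
from mpi4py import MPI

def sync_grads(network):
    """Average gradients over all MPI processes so every worker applies the same update."""
    comm = MPI.COMM_WORLD
    # Flatten every parameter's gradient into one contiguous buffer.
    local_grads = np.concatenate(
        [p.grad.detach().cpu().numpy().flatten() for p in network.parameters()]
    )
    global_grads = np.zeros_like(local_grads)
    comm.Allreduce(local_grads, global_grads, op=MPI.SUM)
    global_grads /= comm.Get_size()  # mean over workers
    # Write the averaged gradients back into the parameters.
    idx = 0
    for p in network.parameters():
        n = p.grad.numel()
        p.grad.copy_(torch.as_tensor(global_grads[idx:idx + n]).view_as(p.grad))
        idx += n

# Usage: each worker does forward/backward on its own batch,
# then gradients are averaged before optimizer.step().
net = torch.nn.Linear(4, 2)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
x, y = torch.randn(32, 4), torch.randn(32, 2)  # local batch on this worker
loss = torch.nn.functional.mse_loss(net(x), y)
opt.zero_grad()
loss.backward()
sync_grads(net)  # effective batch size = 32 * number of processes
opt.step()
```

Launched with something like `mpirun -np 8 python train.py`, every worker computes its own gradient on a local batch, the Allreduce averages them, and each worker applies the identical update, which is why it behaves like a single update with a batch size multiplied by the number of processes.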