difference between parallel_runner and episode_runner?
kkkclearlove opened this issue · 1 comment
kkkclearlove commented
Hello! Many papers in the SMAC line of work emphasize sampling from 8 parallel environments in their experiments. However, with the open-source code of some of these papers, many researchers cannot reproduce the reported results using parallel_runner with batch_size_run == 8, and instead need episode_runner to reproduce them. What is the difference between the two runners, and why does episode_runner give better performance in some scenarios?
I am looking forward to your reply, thanks!
hijkzzz commented
if batch_size_run == 8:
    QMIX + RMSProp == DNN underfitting
    Thus we need QMIX + Adam + TD(\lambda) + large batch size to make the DNN fit the samples better.
elif batch_size_run == 1:
    Please refer to https://github.com/marlbenchmark/off-policy (QMIX with some tricks from PPO)
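
For readers unfamiliar with the TD(\lambda) targets mentioned above, here is a minimal sketch of how such targets can be computed for a single episode. It is not taken from this repository's code: the function name `td_lambda_targets`, its argument names, and the default values of `gamma` and `td_lambda` are all illustrative assumptions, and `next_values[t]` stands in for whatever target-network estimate is used for the state after step t.

```python
def td_lambda_targets(rewards, next_values, terminated, gamma=0.99, td_lambda=0.6):
    """Compute TD(lambda) targets for one episode (hypothetical helper).

    rewards[t]     -- reward received at step t
    next_values[t] -- bootstrap value estimate for the state after step t
    terminated[t]  -- 1 if the state after step t is terminal, else 0
    """
    num_steps = len(rewards)
    targets = [0.0] * num_steps
    # Beyond the last step there is no longer return, so fall back to the bootstrap value.
    next_return = next_values[-1]
    # Backward recursion:
    # G_t = r_t + gamma * ((1 - lambda) * V(s_{t+1}) + lambda * G_{t+1})
    for t in reversed(range(num_steps)):
        not_done = 1.0 - terminated[t]
        blended = (1.0 - td_lambda) * next_values[t] + td_lambda * next_return
        targets[t] = rewards[t] + gamma * not_done * blended
        next_return = targets[t]
    return targets


# With td_lambda=1 and a terminal last step, the targets are Monte Carlo returns.
print(td_lambda_targets([1, 1, 1], [0, 0, 0], [0, 0, 1], gamma=1.0, td_lambda=1.0))
# → [3.0, 2.0, 1.0]
```

Setting `td_lambda=0` in this sketch recovers the ordinary one-step TD target, while values between 0 and 1 blend one-step and multi-step returns.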